[jira] [Updated] (ACCUMULO-4446) Add info log line in `getMonitorLock()`
[ https://issues.apache.org/jira/browse/ACCUMULO-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-4446:
-------------------------------------
    Labels: pull-request-available  (was: )

> Add info log line in `getMonitorLock()`
> ---------------------------------------
>
> Key: ACCUMULO-4446
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4446
> Project: Accumulo
> Issue Type: Improvement
> Components: monitor
> Reporter: Srikanth Viswanathan
> Assignee: Luis Tavarez
> Priority: Trivial
> Labels: pull-request-available
> Fix For: 1.7.3, 1.8.1, 2.0.0
>
> Time Spent: 8.5h
> Remaining Estimate: 0h
>
> When the monitor starts up and hangs on acquiring the monitor lock (for
> whatever reason, perhaps there's another monitor instance already holding the
> lock), there is no suggestion in the logs that the monitor is waiting on a
> lock. An info log would be appropriate here, along with a debug line for each
> failed attempt.
> Ref:
> https://github.com/apache/accumulo/blob/master/server/monitor/src/main/java/org/apache/accumulo/monitor/Monitor.java#L601

--
This message was sent by Atlassian Jira (v8.3.4#803005)
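The logging requested in ACCUMULO-4446 could look roughly like the sketch below. It is only an illustration: the class and method names are hypothetical, the lock attempt is modeled as a plain BooleanSupplier, and java.util.logging stands in for whatever logging framework the monitor actually uses. It is not the code in Monitor.getMonitorLock().

```java
import java.util.function.BooleanSupplier;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch; not the actual Monitor.getMonitorLock() code. */
final class MonitorLockWaitSketch {
  private static final Logger log = Logger.getLogger(MonitorLockWaitSketch.class.getName());

  /**
   * Polls tryLock until it succeeds or maxAttempts is reached.
   * Returns the number of attempts made, or -1 if the lock was never acquired.
   */
  static int waitForLock(BooleanSupplier tryLock, int maxAttempts, long sleepMillis) {
    // One INFO line up front so operators can see why startup appears to hang.
    log.info("Waiting to acquire the monitor lock");
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      if (tryLock.getAsBoolean()) {
        log.info("Acquired the monitor lock after " + attempt + " attempt(s)");
        return attempt;
      }
      // FINE is java.util.logging's rough equivalent of a debug level.
      log.log(Level.FINE, "Attempt {0} to acquire the monitor lock failed", attempt);
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        // Preserve the interrupt for the caller and give up.
        Thread.currentThread().interrupt();
        return -1;
      }
    }
    return -1;
  }
}
```

The INFO line is emitted once before the first attempt, so it shows up even when the very first attempt blocks or fails; the per-attempt line stays at debug level to avoid flooding the log.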
[jira] [Updated] (ACCUMULO-1256) Add trash can for deleted tables
[ https://issues.apache.org/jira/browse/ACCUMULO-1256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-1256:
-------------------------------------
    Labels: gsoc2013 mentor newbie pull-request-available  (was: gsoc2013 mentor newbie)

> Add trash can for deleted tables
> --------------------------------
>
> Key: ACCUMULO-1256
> URL: https://issues.apache.org/jira/browse/ACCUMULO-1256
> Project: Accumulo
> Issue Type: New Feature
> Reporter: Keith Turner
> Priority: Major
> Labels: gsoc2013, mentor, newbie, pull-request-available
> Attachments: ACCUMULO-1256-proposal-01.html,
> ACCUMULO-1256-proposal-01.txt
>
>
> It may be useful to provide an optional trash feature. If this feature were
> enabled, then when a table is deleted it would go into the trash can. Tables
> that had been in the trash for a while would eventually be deleted.
> Tables could be undeleted from the trash can.
> What would the API and shell commands look like? How would multiple tables
> in the trash can with the same name be handled in the API? Would/should
> per-table properties and per-table permissions be preserved? Should these
> tables in the trash can show up in the monitor in some way?

--
This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Updated] (ACCUMULO-1416) FileSystemMonitor isn't necessary anymore?
[ https://issues.apache.org/jira/browse/ACCUMULO-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-1416:
-------------------------------------
    Labels: pull-request-available  (was: )

> FileSystemMonitor isn't necessary anymore?
> ------------------------------------------
>
> Key: ACCUMULO-1416
> URL: https://issues.apache.org/jira/browse/ACCUMULO-1416
> Project: Accumulo
> Issue Type: Bug
> Components: tserver
> Reporter: John Vines
> Priority: Major
> Labels: pull-request-available
>
> With the removal of direct walogs, do we still need tservers monitoring the
> filesystem and dying if things go wonky? Just had it happen and the death of
> the tserver didn't seem necessary, especially with multiple disks.
> Relevant stack trace-
> {code}
> 2013-05-14 08:54:55,693 [util.FileSystemMonitor] FATAL: Exception while checking mount points, halting process
> java.lang.Exception: Filesystem /data/04 switched to read only
>   at org.apache.accumulo.server.util.FileSystemMonitor.checkMounts(FileSystemMonitor.java:134)
>   at org.apache.accumulo.server.util.FileSystemMonitor$1.run(FileSystemMonitor.java:91)
>   at java.util.TimerThread.mainLoop(Timer.java:534)
>   at java.util.TimerThread.run(Timer.java:484)
> {code}

--
This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (ACCUMULO-3455) ShellOptionsJC should extend ClientOpts
[ https://issues.apache.org/jira/browse/ACCUMULO-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-3455:
-------------------------------------
    Labels: newbie pull-request-available  (was: newbie)

> ShellOptionsJC should extend ClientOpts
> ---------------------------------------
>
> Key: ACCUMULO-3455
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3455
> Project: Accumulo
> Issue Type: Improvement
> Components: shell
> Reporter: Josh Elser
> Priority: Major
> Labels: newbie, pull-request-available
>
> ShellOptionsJC duplicates a large portion of the functionality that
> ClientOpts already provides. It should extend ClientOpts and add whatever
> extra functionality it needs.

--
This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (ACCUMULO-4867) Java Collections Improvements in DefaultLoadBalancer
[ https://issues.apache.org/jira/browse/ACCUMULO-4867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-4867:
-------------------------------------
    Labels: pull-request-available  (was: )

> Java Collections Improvements in DefaultLoadBalancer
> ----------------------------------------------------
>
> Key: ACCUMULO-4867
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4867
> Project: Accumulo
> Issue Type: Improvement
> Components: server-base
> Reporter: David Mollitor
> Priority: Minor
> Labels: pull-request-available
>

--
This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (ACCUMULO-810) Add authorizations to continuous ingest test
[ https://issues.apache.org/jira/browse/ACCUMULO-810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-810:
------------------------------------
    Labels: pull-request-available  (was: )

> Add authorizations to continuous ingest test
> --------------------------------------------
>
> Key: ACCUMULO-810
> URL: https://issues.apache.org/jira/browse/ACCUMULO-810
> Project: Accumulo
> Issue Type: Improvement
> Reporter: Keith Turner
> Assignee: Keith Turner
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.4.2, 1.5.0
>
>
> It would be nice if continuous ingest could be configured to set column
> visibility when writing and auths when reading. I am thinking of letting the
> user specify two configuration files: a visibility file and an auths file.
> When writing data, a random line from the visibility file would be used for
> an entire linked list. Walkers would choose a random line from the auths file
> when started. For the verify map reduce job, let the user configure a set of
> auths.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4410) Master didn't not resume balancing after administrative tserver shutdown
[ https://issues.apache.org/jira/browse/ACCUMULO-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-4410:
-------------------------------------
    Labels: pull-request-available  (was: )

> Master didn't not resume balancing after administrative tserver shutdown
> ------------------------------------------------------------------------
>
> Key: ACCUMULO-4410
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4410
> Project: Accumulo
> Issue Type: Bug
> Components: master
> Affects Versions: 1.8.0
> Reporter: Josh Elser
> Assignee: Ivan Bella
> Priority: Critical
> Labels: pull-request-available
> Fix For: 2.0.0
>
>
> I realized that I misconfigured a property, so, I started manually stopping
> each tabletserver (using {{accumulo admin stop }}).
> This worked as intended, the tablets were migrated and the tserver was
> stopped:
> {noformat}
> 2016-08-17 15:24:20,871 [master.EventCoordinator] INFO : Tablet Server shutdown requested for jelser-accumulo-180-4.openstacklocal:59540[1568579a4b4004c]
> 2016-08-17 15:24:20,991 [master.Master] DEBUG: FATE op shutting down jelser-accumulo-180-4.openstacklocal:59540[1568579a4b4004c] finished
> {noformat}
> However, after this point, the master did not resume balancing:
> {noformat}
> 2016-08-17 15:24:31,024 [master.Master] DEBUG: Finished gathering information from 4 servers in 0.02 seconds
> 2016-08-17 15:24:31,024 [master.Master] DEBUG: not balancing while shutting down servers [jelser-accumulo-180-4.openstacklocal:59540[1568579a4b4004c]]
> 2016-08-17 15:24:36,831 [replication.WorkDriver] DEBUG: Sleeping 3 ms before next work assignment
> 2016-08-17 15:24:41,074 [master.Master] DEBUG: Finished gathering information from 4 servers in 0.05 seconds
> 2016-08-17 15:24:41,083 [master.Master] DEBUG: not balancing while shutting down servers [jelser-accumulo-180-4.openstacklocal:59540[1568579a4b4004c]]
> 2016-08-17 15:24:51,134 [master.Master] DEBUG: Finished gathering information from 4 servers in 0.05 seconds
> 2016-08-17 15:24:51,135 [master.Master] DEBUG: not balancing while shutting down servers [jelser-accumulo-180-4.openstacklocal:59540[1568579a4b4004c]]
> {noformat}
> Even after I brought a new tserver online on that host, the master still did
> not resume balancing:
> {noformat}
> 2016-08-17 15:25:53,015 [master.Master] INFO : New servers: [jelser-accumulo-180-4.openstacklocal:54722[2568579a5c3006e]]
> 2016-08-17 15:25:53,026 [master.EventCoordinator] INFO : There are now 5 tablet servers
> 2016-08-17 15:25:53,096 [master.Master] DEBUG: Finished gathering information from 5 servers in 0.06 seconds
> 2016-08-17 15:25:53,109 [master.Master] DEBUG: not balancing while shutting down servers [jelser-accumulo-180-4.openstacklocal:59540[1568579a4b4004c]]
> {noformat}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-705) Batch Scanner needs timeout
[ https://issues.apache.org/jira/browse/ACCUMULO-705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-705:
------------------------------------
    Labels: pull-request-available  (was: )

> Batch Scanner needs timeout
> ---------------------------
>
> Key: ACCUMULO-705
> URL: https://issues.apache.org/jira/browse/ACCUMULO-705
> Project: Accumulo
> Issue Type: New Feature
> Components: client
> Reporter: Keith Turner
> Assignee: Keith Turner
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.5.0
>
>
> The batch scanner needs a user-configurable timeout. When the batch scanner
> is used to query lots of tablets in parallel, if one tablet or tablet server
> is unavailable for some reason it will cause the scan to hang indefinitely.
> Users need more control over this behavior.
> It seems like the batch scanner could behave in one of the following ways:
> * Read as much data as possible, then throw an exception when a tablet or
> tablet server has timed out.
> * Throw an exception as soon as a tablet or tablet server times out, even
> if data could still be read from other tablets successfully.
> The timeout can default to max long to preserve the current behavior.
>

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4818) Upgrade to checkstyle 8+
[ https://issues.apache.org/jira/browse/ACCUMULO-4818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-4818:
-------------------------------------
    Labels: newbie pull-request-available  (was: newbie)

> Upgrade to checkstyle 8+
> ------------------------
>
> Key: ACCUMULO-4818
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4818
> Project: Accumulo
> Issue Type: Improvement
> Components: build
> Reporter: Christopher Tubbs
> Assignee: Christopher Tubbs
> Priority: Major
> Labels: newbie, pull-request-available
> Fix For: 2.0.0
>
>
> While working on ACCUMULO-4817, I noticed that our checkstyle version could
> be updated. However, this will involve some changes to our checkstyle
> configuration, since some of that is no longer compatible with the latest
> version of checkstyle (8.8).

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4496) Upgrade fails due to outstanding Fate operations, refers to nonexistant explaination of how to handle
[ https://issues.apache.org/jira/browse/ACCUMULO-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-4496:
-------------------------------------
    Labels: pull-request-available  (was: )

> Upgrade fails due to outstanding Fate operations, refers to nonexistant
> explaination of how to handle
> -----------------------------------------------------------------------
>
> Key: ACCUMULO-4496
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4496
> Project: Accumulo
> Issue Type: Bug
> Affects Versions: 1.7.2
> Reporter: John Vines
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.0.0
>
>
> {code}
> 2016-10-11 15:24:30,023 [server.Accumulo] FATAL: Problem verifying Fate readiness
> org.apache.accumulo.core.client.AccumuloException: Aborting upgrade because there are outstanding FATE transactions from a previous Accumulo version. Please see the README document for instructions on what to do under your previous version.
>   at org.apache.accumulo.server.Accumulo.abortIfFateTransactions(Accumulo.java:317)
>   at org.apache.accumulo.master.Master.upgradeZookeeper(Master.java:343)
>   at org.apache.accumulo.master.Master.setMasterState(Master.java:283)
>   at org.apache.accumulo.master.Master.getMasterLock(Master.java:1372)
>   at org.apache.accumulo.master.Master.run(Master.java:1113)
>   at com.sqrrl.analytics.server.admin.SqrrlMasterStart$MasterShim.run(SqrrlMasterStart.java:99)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> However, there is no mention of this situation in README.md, nor in the
> UPGRADING.md. Not sure what the appropriate response is to this situation,
> but we should document it.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4855) Parameter error in start-ingest.sh
[ https://issues.apache.org/jira/browse/ACCUMULO-4855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-4855:
-------------------------------------
    Labels: pull-request-available  (was: )

> Parameter error in start-ingest.sh
> ----------------------------------
>
> Key: ACCUMULO-4855
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4855
> Project: Accumulo
> Issue Type: Bug
> Components: scripts, test
> Affects Versions: 1.9.0
> Reporter: Norbert Kalmar
> Priority: Major
> Labels: pull-request-available
>
> I was trying to run the continuous ingest test, but ran into an error:
> In start-ingest.sh, DEBUG_OPT is set to use -debug, but that option only
> turns logging on or off.
> When calling the pssh command, we use it to set where to log. It will not
> run, and in the .out log it will just write the usage:
> Usage: org.apache.accumulo.test.continuous.ContinuousIngest [options]
>   Options:
> [..]
>     --debug
>        turn on TRACE-level log messages
>        Default: false
>     --debugLog
>        file to write debugging output

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4854) DfsLogger exceptions are too verbose
[ https://issues.apache.org/jira/browse/ACCUMULO-4854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-4854:
-------------------------------------
    Labels: pull-request-available  (was: )

> DfsLogger exceptions are too verbose
> ------------------------------------
>
> Key: ACCUMULO-4854
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4854
> Project: Accumulo
> Issue Type: Improvement
> Components: tserver
> Affects Versions: 1.9.0, 2.0.0
> Reporter: Ivan Bella
> Assignee: Ivan Bella
> Priority: Minor
> Labels: pull-request-available
> Fix For: 2.0.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The DfsLogger is constantly spewing exceptions to the logs when the logs are
> simply being closed. This is a normal situation when wal logs roll over.
> The exceptions should not be showing as warnings in the accumulo monitor.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4629) Seeking in timestamp range is slow
[ https://issues.apache.org/jira/browse/ACCUMULO-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-4629:
-------------------------------------
    Labels: pull-request-available  (was: )

> Seeking in timestamp range is slow
> ----------------------------------
>
> Key: ACCUMULO-4629
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4629
> Project: Accumulo
> Issue Type: Bug
> Reporter: Keith Turner
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.0.0
>
>
> Fluo's internal schema uses the first 4 bits of the timestamp to store
> different types of information per column. These first 4 bits divide the
> timestamp space up into 16 ranges. Fluo has server side iterators that
> consider information in one of these 16 ranges and then seek forward to
> another of the 16 ranges.
> Unfortunately, Accumulo's built-in iterator that processes delete markers
> makes seeking within the timestamps of a column an O(N^2) operation. This is
> because of the way deletes work in Accumulo. A delete marker for timestamp X
> in Accumulo deletes anything with a timestamp <= X.
> When seeking to timestamp Y, the Accumulo iterator that handles deletes will
> scan from MAX_LONG to Y looking for any deletes that may keep you from seeing
> data at timestamp Y. The following example shows what the delete iterator
> will do when a user iterator does some seeks.
> * User iterator seeks to stamp 1,000,000. This causes the delete iter to
> scan from MAX_LONG to 1,000,000 looking for delete markers.
> * User iterator seeks to stamp 900,000. This causes the delete iter to scan
> from MAX_LONG to 900,000 looking for delete markers.
> * User iterator seeks to stamp 500,000. This causes the delete iter to scan
> from MAX_LONG to 500,000 looking for delete markers.
> So that Fluo can seek efficiently, it has done some [serious
> shenanigans|https://github.com/apache/incubator-fluo/blob/rel/fluo-1.0.0-incubating/modules/accumulo/src/main/java/org/apache/fluo/accumulo/iterators/TimestampSkippingIterator.java#L164]
> using reflection to remove the DeleteIterator. The great work being done
> on ACCUMULO-3079 will likely break this crazy reflection code. So I would
> like to make a change to upstream Accumulo that allows efficient seeking in
> the timestamp range. I have thought of the following two possible solutions.
> * Make the DeleteIterator stateful so that it remembers what ranges it has
> scanned for deletes. I don't really like this solution because it will add
> expense to every seek in Accumulo for an edge case.
> * Make it possible to create tables with an exact delete behavior. Meaning a
> delete for timestamp X will only delete an existing row column with that
> exact timestamp. This option could only be chosen at table creation time and
> could not be changed. For this delete behavior, the delete iterator does not
> need to scan for every seek.
> Are there other possible solutions?

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
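The quadratic cost described in ACCUMULO-4629 can be illustrated with a toy cost model. This is only a sketch under simplifying assumptions: one column whose entries are held as an array of timestamps in descending order, and a count of how many entries get examined. The class and method names are hypothetical, and nothing here is Accumulo's delete iterator.

```java
/** Toy cost model for repeated seeks within one column; not Accumulo code. */
final class DeleteScanCostSketch {

  /** Entries examined when every seek rescans from the newest timestamp down to the target. */
  static long rescanCost(long[] timestampsDescending, long[] seekTargets) {
    long examined = 0;
    for (long target : seekTargets) {
      for (long ts : timestampsDescending) {
        if (ts < target) {
          break; // past the seek target, stop this scan
        }
        examined++;
      }
    }
    return examined;
  }

  /**
   * Entries examined when the already-scanned range is remembered.
   * Assumes seek targets arrive in descending order, as in the issue's example.
   */
  static long rememberedCost(long[] timestampsDescending, long[] seekTargets) {
    long examined = 0;
    int next = 0; // index of the first entry not yet examined
    for (long target : seekTargets) {
      while (next < timestampsDescending.length && timestampsDescending[next] >= target) {
        examined++;
        next++;
      }
    }
    return examined;
  }
}
```

With N entries and a seek to each timestamp in turn, the first method examines 1 + 2 + ... + N entries while the second examines N, which is the gap the stateful-iterator proposal would close at the price of extra bookkeeping on every seek.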
[jira] [Updated] (ACCUMULO-551) Experiment with multi-node batch writer
[ https://issues.apache.org/jira/browse/ACCUMULO-551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-551:
------------------------------------
    Labels: pull-request-available  (was: )

> Experiment with multi-node batch writer
> ---------------------------------------
>
> Key: ACCUMULO-551
> URL: https://issues.apache.org/jira/browse/ACCUMULO-551
> Project: Accumulo
> Issue Type: Task
> Reporter: Keith Turner
> Assignee: Keith Turner
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Accumulo has a batch writer that batches mutations by tablet server for
> writes. This works well until there are a lot of tablet servers being written
> to, at which point only a small amount of data is being sent to each tablet
> server. Would it be better for the client to batch writes for multiple
> tablet servers and send them to one server which writes directly to the
> tablet servers?
> One possible way to do this is to:
>
> * batch mutations by rack on the client
> * send all of those mutations to one random tablet server on the rack
> * have the random tablet server write to the other servers on the rack
> This cuts down on the number of direct connections the client has to make.
> It could have the following benefits.
> * Tablet servers can keep connections open to other tablet servers.
> * A write pipeline
> It would be interesting to run some tests and see how well this works.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4074) create user-configurable resource pools for different kinds of requests
[ https://issues.apache.org/jira/browse/ACCUMULO-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-4074:
-------------------------------------
    Labels: pull-request-available  (was: )

> create user-configurable resource pools for different kinds of requests
> -----------------------------------------------------------------------
>
> Key: ACCUMULO-4074
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4074
> Project: Accumulo
> Issue Type: Improvement
> Components: client, tserver
> Reporter: Eric Newton
> Assignee: Keith Turner
> Priority: Major
> Labels: pull-request-available
>
> Complex queries and iterator stacks can sometimes run for long periods of
> time. During that time, access to resources for shorter, simpler lookups can
> be blocked. Use separate resource pools to allow for simpler queries to be
> able to run regardless. This same mechanism could be used for the metadata
> and root tables, too.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-3510) Create mechanism to support priority based scheduling of read ahead tasks.
[ https://issues.apache.org/jira/browse/ACCUMULO-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-3510:
-------------------------------------
    Labels: pull-request-available  (was: )

> Create mechanism to support priority based scheduling of read ahead tasks.
> ---------------------------------------------------------------------------
>
> Key: ACCUMULO-3510
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3510
> Project: Accumulo
> Issue Type: Improvement
> Components: tserver
> Affects Versions: 1.6.0
> Reporter: marco polo
> Assignee: marco polo
> Priority: Minor
> Labels: pull-request-available
> Fix For: 2.0.0
>
>
> I have many cases where ScanSessions will consume resources that I otherwise
> want shorter running scans to utilize. In some cases, a scan may continue for
> hours, while a short running scan may come in and execute quickly. As a
> result, I want to be able to adjust the priority of these scan sessions.
> I have a forthcoming patch that breaks Session out of TabletServer
> and replaces the queue in the readAheadThreadPool with a priority pool. The
> comparator I have created as a proof of concept, which can be adjusted,
> reduces the priority of the oldest scan. Using an aging technique, we
> guarantee execution of these older running scans based upon the previous run
> time. As a result, we give preference to newer scans. If they execute
> quickly, older scans will have an inherent rise in priority. If they also
> take a while, their priority will be reduced and incoming scans will yet
> again be given a greater priority with the intent (and/or hope) that their
> execution will be faster.
> Priority should be configurable based on the desired Session Comparator.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
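One way to picture the comparator idea in ACCUMULO-3510 is the sketch below. It is not the patch mentioned in the issue: the Session class, its fields, and the scoring rule (run time minus time spent waiting) are all assumptions made for illustration. It only shows how a priority queue with a comparator can favor newer scans while letting long-waiting scans age back up.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

/** Hypothetical sketch of priority-ordered scan sessions; not the actual patch. */
final class ScanPrioritySketch {

  /** Stand-in for a scan session: a name, time already spent running, and time spent waiting. */
  static final class Session {
    final String name;
    final long runTimeMillis;
    final long waitedMillis;

    Session(String name, long runTimeMillis, long waitedMillis) {
      this.name = name;
      this.runTimeMillis = runTimeMillis;
      this.waitedMillis = waitedMillis;
    }

    /** Lower score runs first: run time lowers priority, waiting raises it again (aging). */
    long score() {
      return runTimeMillis - waitedMillis;
    }
  }

  static final Comparator<Session> LOWEST_SCORE_FIRST =
      Comparator.comparingLong((Session s) -> s.score());

  /** Returns the name of the session that would be dequeued first, or null if none were given. */
  static String nextToRun(Session... sessions) {
    PriorityBlockingQueue<Session> queue = new PriorityBlockingQueue<>(11, LOWEST_SCORE_FIRST);
    for (Session s : sessions) {
      queue.add(s);
    }
    Session head = queue.poll();
    return head == null ? null : head.name;
  }
}
```

Because the comparator is a separate object, swapping in a different ordering only means passing a different Comparator to the queue, which matches the issue's request that priority be configurable via the session comparator.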
[jira] [Updated] (ACCUMULO-2537) Add @Override Annotations
[ https://issues.apache.org/jira/browse/ACCUMULO-2537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-2537:
-------------------------------------
    Labels: newbie pull-request-available  (was: newbie)

> Add @Override Annotations
> -------------------------
>
> Key: ACCUMULO-2537
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2537
> Project: Accumulo
> Issue Type: Task
> Reporter: Mike Drob
> Priority: Trivial
> Labels: newbie, pull-request-available
>
> We should add @Override annotations to methods in our code. This can be
> detected and fixed automatically using eclipse.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4813) Accepting mapping file for bulk import
[ https://issues.apache.org/jira/browse/ACCUMULO-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-4813:
-------------------------------------
    Labels: pull-request-available  (was: )

> Accepting mapping file for bulk import
> --------------------------------------
>
> Key: ACCUMULO-4813
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4813
> Project: Accumulo
> Issue Type: Sub-task
> Reporter: Keith Turner
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.0.0
>
>
> During bulk import, inspecting files to determine where they go is expensive
> and slow. In order to spread the cost, Accumulo has an internal mechanism to
> spread the work of inspecting files to random tablet servers. Because this
> internal process takes time and consumes resources on the cluster, users want
> control over it. The best way to give this control may be to externalize it
> by allowing bulk imports to have a mapping file. This mapping file would
> specify the ranges where files should be loaded. If Accumulo provided an API
> to help produce this file, then that work could be done in Map Reduce or
> Spark. This would give users all the control they want over when and where
> this computation is done. This would naturally fit in the process used to
> create the bulk files.
> To make bulk import fast, this mapping file should have the following
> properties.
> * Key in file is a range
> * Value in file is a list of files
> * Ranges are non-overlapping
> * File is sorted by range/key
> * Has a mapping for every non-empty file in the bulk import directory.
> If Accumulo provides APIs to do the following operations, then producing the
> file could be written as a map/reduce job.
> * For a given rfile, produce a list of row ranges where the file should be
> loaded. These row ranges would be based on tablets.
> * Merge row range, list of files pairs
> * Serialize row range, list of files pairs
> With a mapping file, the bulk import algorithm could be written as follows.
> This could all be executed in the master with no need to run inspection tasks
> on random tablet servers.
> * Sanity check file
> ** Ensure in sorted order
> ** Ensure ranges are non-overlapping
> ** Ensure each file in directory has at least one entry in file
> ** Ensure all splits in the file exist in the table.
> * Since the file is sorted, can do a merged read of file and metadata table,
> looping over the following operations for each tablet until all files are
> loaded.
> ** Read the loaded files for the tablet
> ** Read the files to load for the range
> ** For any files not loaded, send an async load message to the tablet server
> The above algorithm can just keep scanning the metadata table and sending
> async load messages until the bulk import is complete. Since the load
> messages are async, the bulk load of a large number of files could
> potentially be very fast.
> The bulk load operation can easily handle the case of tablets splitting
> during the operation by matching a single range in the file to multiple
> tablets. However, attempting to handle merges would be a lot more tricky. It
> would probably be simplest to fail the operation if a merge is detected. The
> nice thing is that this can be done in a very clean way. Once the bulk
> import operation has the table lock, merges cannot happen. So after getting
> the table lock, the bulk import operation can ensure all splits in the file
> exist in the table. The operation can abort if the condition is not met
> before doing any work. If this condition is not met, it indicates a merge
> happened between generating the mapping file and doing the bulk import.
> Hopefully the mapping file plus the algorithm that sends async load messages
> can dramatically speed up bulk import operations. This may lessen the need
> for other things like prioritizing bulk import. To measure this, it would be
> very useful to create a bulk import performance test that can create many
> files with very little data and measure the time it takes to load them.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
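The first two sanity checks proposed in ACCUMULO-4813 (sorted order and non-overlapping ranges) could be sketched as follows. The issue does not define a file format, so everything here is assumed for illustration: the Entry class, the use of plain strings for rows, and half-open ranges [startRow, endRow). The checks against the bulk directory and the table's splits are left out because they need cluster state.

```java
import java.util.List;

/** Hypothetical sketch of a mapping-file sanity check; the data model is assumed. */
final class BulkMappingCheckSketch {

  /** One mapping entry: rows in [startRow, endRow) should receive the listed files. */
  static final class Entry {
    final String startRow;
    final String endRow;
    final List<String> files;

    Entry(String startRow, String endRow, List<String> files) {
      this.startRow = startRow;
      this.endRow = endRow;
      this.files = files;
    }
  }

  /** True if every range is non-empty, has files, and the ranges are sorted without overlap. */
  static boolean isSane(List<Entry> entries) {
    String prevEnd = null;
    for (Entry e : entries) {
      if (e.startRow.compareTo(e.endRow) >= 0) {
        return false; // empty or inverted range
      }
      if (e.files.isEmpty()) {
        return false; // a range with nothing to load
      }
      // Starting before the previous range ended means the file is unsorted or overlapping.
      if (prevEnd != null && e.startRow.compareTo(prevEnd) < 0) {
        return false;
      }
      prevEnd = e.endRow;
    }
    return true;
  }
}
```

Because the entries are expected to be sorted, a single pass that compares each range with the end of the previous one covers both the ordering and the overlap check.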
[jira] [Updated] (ACCUMULO-4791) setshelliter command has incorrect usage statement
[ https://issues.apache.org/jira/browse/ACCUMULO-4791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-4791:
-------------------------------------
    Labels: pull-request-available  (was: )

> setshelliter command has incorrect usage statement
> --------------------------------------------------
>
> Key: ACCUMULO-4791
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4791
> Project: Accumulo
> Issue Type: Bug
> Components: shell
> Reporter: Mark Owens
> Assignee: Mark Owens
> Priority: Minor
> Labels: pull-request-available
> Fix For: 1.9.0, 2.0.0
>
>
> When using the Accumulo shell, the setshelliter command should not require a
> table name, and the usage indicates that a table name is optional, as
> expected. But the command will fail unless an existing table name is
> supplied. This is most likely because setshelliter extends setiter,
> which does require a valid table name.
> The setshelliter command should be updated to work as the current usage
> indicates, i.e., no table name should be required.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4850) Provide grafana dashboard and report more metrics
[ https://issues.apache.org/jira/browse/ACCUMULO-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-4850:
-------------------------------------
    Labels: pull-request-available  (was: )

> Provide grafana dashboard and report more metrics
> -------------------------------------------------
>
> Key: ACCUMULO-4850
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4850
> Project: Accumulo
> Issue Type: Improvement
> Reporter: Mike Walch
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.9.0, 2.0.0
>
>
> Improve metrics by providing grafana dashboard and instructions for setting
> up metrics server. Also, include more metrics in reporting.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4615) ThreadPool timeout when checking tserver stats is confusing
[ https://issues.apache.org/jira/browse/ACCUMULO-4615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-4615:
-------------------------------------
    Labels: pull-request-available  (was: )

> ThreadPool timeout when checking tserver stats is confusing
> -----------------------------------------------------------
>
> Key: ACCUMULO-4615
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4615
> Project: Accumulo
> Issue Type: Bug
> Components: master
> Affects Versions: 1.8.1
> Reporter: Michael Wall
> Assignee: Jeff Schmidt
> Priority: Minor
> Labels: pull-request-available
> Fix For: 1.9.0, 2.0.0
>
>
> If it takes longer than the configured time to gather information from all
> the tablet servers, the thread pool stops and processing continues with
> whatever has been collected. Code is
> https://github.com/apache/accumulo/blob/1.8/server/master/src/main/java/org/apache/accumulo/master/Master.java#L1120,
> default timeout is 6s. Does not appear to be an issue prior to 1.8.
> Best case, this was really confusing. The monitor page would have 30
> tservers, then 5 tservers. Didn't really see any other negative effects, no
> migrations and no balancing appeared to be affected. Worst case though, I
> missed something and the master is making decisions based on incomplete
> information.
> [~dlmar...@comcast.net] please add more info if needed.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
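The confusing part of ACCUMULO-4615 is that a timeout silently yields partial results. A generic way to make that visible is sketched below using the standard ExecutorService.invokeAll call with a deadline, which cancels any task that has not finished in time. This is not the Master's code; the class name, the String result type, and the message text are assumptions for illustration.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: report how many status requests missed the deadline. */
final class GatherWithTimeoutSketch {

  /** Runs all tasks with one overall deadline and returns how many did not finish in time. */
  static int countTimedOut(List<Callable<String>> tasks, long timeoutMillis) {
    ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, tasks.size()));
    try {
      // invokeAll cancels every task that has not completed when the timeout expires.
      List<Future<String>> futures = pool.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS);
      int timedOut = 0;
      for (Future<String> f : futures) {
        if (f.isCancelled()) {
          timedOut++;
        }
      }
      if (timedOut > 0) {
        // Say explicitly that the collected view is incomplete instead of continuing silently.
        System.err.println(timedOut + " of " + tasks.size()
            + " servers did not respond within " + timeoutMillis + " ms");
      }
      return timedOut;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return tasks.size();
    } finally {
      pool.shutdownNow();
    }
  }
}
```

A caller could use the returned count to decide whether to act on the partial information or wait for the next collection cycle.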
[jira] [Updated] (ACCUMULO-4847) Fix Retry API
[ https://issues.apache.org/jira/browse/ACCUMULO-4847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4847: - Labels: pull-request-available (was: ) > Fix Retry API > - > > Key: ACCUMULO-4847 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4847 > Project: Accumulo > Issue Type: Bug > Components: fate >Reporter: Christopher Tubbs >Assignee: Christopher Tubbs >Priority: Blocker > Labels: pull-request-available > Fix For: 1.7.4, 1.9.0, 2.0.0 > > > ACCUMULO-4834 exposed a merge bug from ACCUMULO-4777. ACCUMULO-4777 added a > parameter to the two Retry constructors, but since all the parameters are > longs, this masked what should have been a compile failure by causing calls > to the longer constructor to now use the shorter one. > The side effect is that any calls with "unlimited number of retries" would > now be interpreted as "unlimited wait time between retries", causing the > first failure to sleep for Long.MAX_VALUE millis before retrying. > I propose a refactor with a builder so the parameters cannot be confused for > one another, since they all have the same long type. This is internal code, > so there is no public API change. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
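A minimal sketch of the builder idea proposed in ACCUMULO-4847, assuming nothing about the real Retry class: `RetrySketch`, its fields, and every builder method name here are hypothetical and chosen only for illustration. The point is that each `long` is set through a named method, so swapping "number of retries" with "wait time" can no longer compile silently into the wrong meaning.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a builder-based retry configuration (not Accumulo's actual class). */
final class RetrySketch {
  static final long UNLIMITED = -1;

  private final long maxRetries;
  private final long startWaitMillis;
  private final long waitIncrementMillis;
  private final long maxWaitMillis;

  private RetrySketch(Builder b) {
    this.maxRetries = b.maxRetries;
    this.startWaitMillis = b.startWaitMillis;
    this.waitIncrementMillis = b.waitIncrementMillis;
    this.maxWaitMillis = b.maxWaitMillis;
  }

  static Builder builder() {
    return new Builder();
  }

  long maxRetries() { return maxRetries; }
  long startWaitMillis() { return startWaitMillis; }
  long waitIncrementMillis() { return waitIncrementMillis; }
  long maxWaitMillis() { return maxWaitMillis; }

  static final class Builder {
    private long maxRetries = UNLIMITED;
    private long startWaitMillis = 0;
    private long waitIncrementMillis = 0;
    private long maxWaitMillis = 0;

    private Builder() {}

    // Each setter names the quantity it sets, so two longs cannot be swapped by position.
    Builder infiniteRetries() {
      this.maxRetries = UNLIMITED;
      return this;
    }

    Builder maxRetries(long n) {
      if (n < 0) {
        throw new IllegalArgumentException("maxRetries must be non-negative");
      }
      this.maxRetries = n;
      return this;
    }

    Builder retryAfter(long duration, TimeUnit unit) {
      this.startWaitMillis = unit.toMillis(duration);
      return this;
    }

    Builder incrementBy(long duration, TimeUnit unit) {
      this.waitIncrementMillis = unit.toMillis(duration);
      return this;
    }

    Builder maxWait(long duration, TimeUnit unit) {
      this.maxWaitMillis = unit.toMillis(duration);
      return this;
    }

    RetrySketch build() {
      // Reject configurations where the cap is below the initial wait.
      if (maxWaitMillis < startWaitMillis) {
        throw new IllegalStateException("maxWait must be at least the initial wait");
      }
      return new RetrySketch(this);
    }
  }
}
```

With this shape, a call such as `RetrySketch.builder().infiniteRetries().retryAfter(100, TimeUnit.MILLISECONDS).maxWait(2, TimeUnit.SECONDS).build()` reads unambiguously at the call site.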
[jira] [Updated] (ACCUMULO-3389) Iterator Names can't contain dots
[ https://issues.apache.org/jira/browse/ACCUMULO-3389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-3389: - Labels: pull-request-available (was: ) > Iterator Names can't contain dots > - > > Key: ACCUMULO-3389 > URL: https://issues.apache.org/jira/browse/ACCUMULO-3389 > Project: Accumulo > Issue Type: Bug >Affects Versions: 1.6.0 >Reporter: John Vines >Assignee: Mark Owens >Priority: Major > Labels: pull-request-available > Fix For: 1.9.0, 2.0.0 > > > Attempting to attach an iterator whose name includes dots results in > messages on the remote end from IteratorUtil - "Unrecognizable option: > ITERNAME". No errors or warnings appear when the iterators are attached. > They get added to the config without issue. > But when you do something like listiter, they don't show up and then warnings > appear in the logs/on the monitor. > We should either: > A. allow iterators with dots in the name > B. doc that this isn't allowed and check server side when they are set. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4835) Client should throw TableNotFoundException
[ https://issues.apache.org/jira/browse/ACCUMULO-4835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4835: - Labels: pull-request-available (was: ) > Client should throw TableNotFoundException > -- > > Key: ACCUMULO-4835 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4835 > Project: Accumulo > Issue Type: Bug > Components: client >Affects Versions: 1.7.4, 1.9.0 >Reporter: Michael Miller >Assignee: Michael Miller >Priority: Minor > Labels: pull-request-available > > I saw this misleading exception thrown while running randomwalk during the > concurrent OfflineTable node: > Caused by: org.apache.accumulo.core.client.AccumuloException: Unexpected > table state g0 DELETING != ONLINE > at > org.apache.accumulo.core.client.impl.TableOperationsImpl.waitForTableStateTransition(TableOperationsImpl.java:1072) > at > org.apache.accumulo.core.client.impl.TableOperationsImpl.online(TableOperationsImpl.java:1240) > at > org.apache.accumulo.test.randomwalk.concurrent.OfflineTable.visit(OfflineTable.java:47) > The table was in the process of being deleted so a TableNotFoundException > would be more helpful. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4836) Tables do not always wait for online or offline
[ https://issues.apache.org/jira/browse/ACCUMULO-4836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4836: - Labels: pull-request-available (was: ) > Tables do not always wait for online or offline > --- > > Key: ACCUMULO-4836 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4836 > Project: Accumulo > Issue Type: Bug >Affects Versions: 1.7.3 >Reporter: Keith Turner >Priority: Major > Labels: pull-request-available > Fix For: 1.7.4, 1.9.0, 2.0.0 > > > While investigating why TabletStateChangeIteratorIT was failing, it was > discovered that onlining a table with wait=true does not always wait. The test > relied on this API to wait and that is why it was failing. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4832) Seeing warnings when write ahead log changes.
[ https://issues.apache.org/jira/browse/ACCUMULO-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4832: - Labels: pull-request-available (was: ) > Seeing warnings when write ahead log changes. > - > > Key: ACCUMULO-4832 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4832 > Project: Accumulo > Issue Type: Bug >Reporter: Keith Turner >Assignee: Ivan Bella >Priority: Blocker > Labels: pull-request-available > Fix For: 1.7.4, 1.9.0, 2.0.0 > > > While running continuous ingest against 1.7.4-rc0 I saw a lot of warnings like > the following. > {noformat} > 2018-02-26 17:51:58,189 [log.TabletServerLogger] WARN : Logs closed while > writing, retrying attempt 1 (suppressing retry messages for 18ms) > 2018-02-26 17:51:58,724 [log.TabletServerLogger] WARN : Logs closed while > writing, retrying attempt 1 (suppressing retry messages for 18ms) > 2018-02-26 17:51:58,940 [log.TabletServerLogger] WARN : Logs closed while > writing, retrying attempt 1 (suppressing retry messages for 18ms) > 2018-02-26 17:51:59,226 [log.TabletServerLogger] WARN : Logs closed while > writing, retrying attempt 1 (suppressing retry messages for 18ms) > 2018-02-26 17:51:59,227 [log.TabletServerLogger] WARN : Logs closed while > writing, retrying attempt 1 (suppressing retry messages for 18ms) > 2018-02-26 17:51:59,227 [log.TabletServerLogger] WARN : Logs closed while > writing, retrying attempt 1 (suppressing retry messages for 18ms) > {noformat} > > The warnings are generated by [TabletServerLogger.java line > 341|https://github.com/apache/accumulo/blob/4e91215f101362ef206e9f213b4d8d12b3f6e0e2/server/tserver/src/main/java/org/apache/accumulo/tserver/log/TabletServerLogger.java#L341] > when a write ahead log is closed. 
Write ahead logs are closed as part of > normal operations as seen on [TabletServerLogger.java line > 386|https://github.com/apache/accumulo/blob/4e91215f101362ef206e9f213b4d8d12b3f6e0e2/server/tserver/src/main/java/org/apache/accumulo/tserver/log/TabletServerLogger.java#L386]. > There should not be a warning when this happens. This is caused by changes > made for ACCUMULO-4777. Before these changes this event was logged at debug. > At this time, these changes have not been released. It would be nice to fix > this before releasing 1.7.4 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-3807) Log Recovery Copy/Sort progress exceeds 100%
[ https://issues.apache.org/jira/browse/ACCUMULO-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-3807: - Labels: newbie pull-request-available (was: newbie) > Log Recovery Copy/Sort progress exceeds 100% > > > Key: ACCUMULO-3807 > URL: https://issues.apache.org/jira/browse/ACCUMULO-3807 > Project: Accumulo > Issue Type: Improvement > Components: monitor >Reporter: Josh Elser >Priority: Minor > Labels: newbie, pull-request-available > Fix For: 2.0.0 > > > I regularly notice that the Copy/Sort progress bar for log recovery on the > monitor exceeds 100%. We should either > * Find out why this exceeds 100% and fix the computation > or > * Just cap the value so that it just reports 100% and doesn't exceed it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4826) Support Hadoop 3
[ https://issues.apache.org/jira/browse/ACCUMULO-4826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4826: - Labels: pull-request-available (was: ) > Support Hadoop 3 > > > Key: ACCUMULO-4826 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4826 > Project: Accumulo > Issue Type: New Feature >Reporter: Josh Elser >Assignee: Josh Elser >Priority: Major > Labels: pull-request-available > Fix For: 1.9.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4824) Jersey dependency conflict is causing ITs to fail
[ https://issues.apache.org/jira/browse/ACCUMULO-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4824: - Labels: pull-request-available (was: ) > Jersey dependency conflict is causing ITs to fail > - > > Key: ACCUMULO-4824 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4824 > Project: Accumulo > Issue Type: Task > Components: test >Affects Versions: 2.0.0 >Reporter: Mike Walch >Assignee: Mike Walch >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > > Several ITs require Jersey 1.x for Hadoop minicluster but it is excluded to > test Monitor web resource validation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4820) Cleanup code for 2.0
[ https://issues.apache.org/jira/browse/ACCUMULO-4820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4820: - Labels: pull-request-available (was: ) > Cleanup code for 2.0 > > > Key: ACCUMULO-4820 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4820 > Project: Accumulo > Issue Type: Improvement >Affects Versions: 2.0.0 >Reporter: Michael Miller >Assignee: Michael Miller >Priority: Minor > Labels: pull-request-available > Fix For: 2.0.0 > > > Running IntelliJ code inspect picks up a lot of minor code clean up fixes > that would be nice to make sooner rather than later. > Java 5 Updates > - unnecessary boxing > - use foreach where possible > Java 7 Updates (these should definitely be fixed) > - Explicit type can be replaced with <> (aka diamond operator) > - Identical catch branches in try > - try finally replaceable with try with resources > Java 8 Updates (these would be nice, but maybe some prefer older way?) > - Replace anonymous types with lambda > - Replace code with new single Map method call > - Replace Collections.sort () with List.sort() > Other Misc performance Issues picked up by the inspector. I think these > should definitely be fixed but perhaps in a sub ticket -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4788) Improve Thrift Transport pool
[ https://issues.apache.org/jira/browse/ACCUMULO-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4788: - Labels: pull-request-available (was: ) > Improve Thrift Transport pool > - > > Key: ACCUMULO-4788 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4788 > Project: Accumulo > Issue Type: Improvement >Reporter: Keith Turner >Assignee: Keith Turner >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > > Accumulo has a pool of recently opened connections to tablet servers. When > connecting to tablet servers, this pool is checked first. The pool is built > around a map of list. There are two problems with this pool : > * It has global lock around the map of list > * When trying to find a connection it does a linear search for a non > reserved connection (this is per tablet server) > Could possibly move to a model of having a list of unreserved connections and > a set of reserved connections per tablet server. Then to get a connection, > could remove from the unreserved list and add to the reserved set. This > would be a constant time operation. > For the locking, could move to a model of using a concurrent map and locking > per tserver instead of locking the entire map. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
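The model suggested in ACCUMULO-4788 (an unreserved list plus a reserved set per tablet server, with per-server locking on a concurrent map) can be sketched as follows. This is an illustration under assumptions, not the actual transport pool: `PerServerPoolSketch`, the `String` server key, and the `Supplier` used to open a connection are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

/** Hypothetical sketch: per-server pools with constant-time reserve and release. */
final class PerServerPoolSketch<C> {

  /** Connections for one server; guarded by synchronizing on the instance. */
  private static final class ServerConnections<T> {
    final Deque<T> unreserved = new ArrayDeque<>();
    // Identity-based set, so two distinct connections are never treated as equal.
    final Set<T> reserved = Collections.newSetFromMap(new IdentityHashMap<T, Boolean>());
  }

  // Concurrent map: looking up a server's entry takes no global lock.
  private final ConcurrentMap<String, ServerConnections<C>> servers = new ConcurrentHashMap<>();

  /** Takes an idle connection if one exists, otherwise opens a new one. */
  C reserve(String server, Supplier<C> opener) {
    ServerConnections<C> sc = servers.computeIfAbsent(server, k -> new ServerConnections<>());
    synchronized (sc) { // lock only this server's entry
      C conn = sc.unreserved.pollFirst(); // O(1), no linear search
      if (conn == null) {
        conn = opener.get();
      }
      sc.reserved.add(conn);
      return conn;
    }
  }

  /** Returns a previously reserved connection to the idle list. */
  void release(String server, C conn) {
    ServerConnections<C> sc = servers.get(server);
    if (sc == null) {
      throw new IllegalArgumentException("unknown server: " + server);
    }
    synchronized (sc) {
      if (!sc.reserved.remove(conn)) {
        throw new IllegalStateException("connection was not reserved");
      }
      sc.unreserved.addFirst(conn);
    }
  }

  int idleCount(String server) {
    ServerConnections<C> sc = servers.get(server);
    if (sc == null) {
      return 0;
    }
    synchronized (sc) {
      return sc.unreserved.size();
    }
  }
}
```

One simplification to note: the sketch opens new connections while holding the per-server lock, which a real pool would probably avoid, and it omits idle-connection expiry entirely.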
[jira] [Updated] (ACCUMULO-4801) Consider precomputing some client context fields
[ https://issues.apache.org/jira/browse/ACCUMULO-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4801: - Labels: pull-request-available (was: ) > Consider precomputing some client context fields > > > Key: ACCUMULO-4801 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4801 > Project: Accumulo > Issue Type: Improvement >Reporter: Keith Turner >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > > Currently each time a connection is requested from the thrift transport > pool, three methods are called on client context to get ssl, sasl, and > timeout. These in turn call methods on configuration. This is showing up in > profiling as slow. I wonder if these could be precomputed in the client > context constructor. > > Also, repeatedly calling rpcCreds() on client context is showing up as slow. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4799) In tablet server start scan authenticates twice
[ https://issues.apache.org/jira/browse/ACCUMULO-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4799: - Labels: pull-request-available (was: ) > In tablet server start scan authenticates twice > --- > > Key: ACCUMULO-4799 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4799 > Project: Accumulo > Issue Type: Improvement >Affects Versions: 1.7.3, 1.8.1 >Reporter: Keith Turner >Priority: Major > Labels: pull-request-available > Fix For: 1.9.0, 2.0.0 > > > The code that handles a start scan RPC call checks authentication twice. > Each call to authenticate takes a bit of time. It would be nice if it only > did it once. > At [TabletServer line > 479|https://github.com/apache/accumulo/blob/rel/1.8.1/server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java#L479] > a call to canScan is made which calls authenticate. Then at [TabletServer > line > 482|https://github.com/apache/accumulo/blob/rel/1.8.1/server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java#L482] > a call to check authorizations is made which also authenticates. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4814) Add links to Java classes in examples documentation
[ https://issues.apache.org/jira/browse/ACCUMULO-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4814: - Labels: pull-request-available (was: ) > Add links to Java classes in examples documentation > --- > > Key: ACCUMULO-4814 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4814 > Project: Accumulo > Issue Type: Task > Components: examples >Reporter: Mike Walch >Assignee: Mike Walch >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4804) Bulk Ingest example is broken
[ https://issues.apache.org/jira/browse/ACCUMULO-4804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4804: - Labels: pull-request-available (was: ) > Bulk Ingest example is broken > - > > Key: ACCUMULO-4804 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4804 > Project: Accumulo > Issue Type: Bug > Components: examples >Affects Versions: 2.0.0 >Reporter: Michael Miller >Assignee: Michael Miller >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > > The bulk import > ([https://github.com/milleruntime/accumulo-examples/blob/master/docs/bulkIngest.md]) > example in the new 2.0 repo does not work on its own. When trying to run > the commands, you get the error: > The following option is required: [-c | --conf] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4789) Scans spend significant time constructing debug string.
[ https://issues.apache.org/jira/browse/ACCUMULO-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4789: - Labels: pull-request-available (was: ) > Scans spend significant time constructing debug string. > --- > > Key: ACCUMULO-4789 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4789 > Project: Accumulo > Issue Type: Improvement >Reporter: Keith Turner >Priority: Major > Labels: pull-request-available > > While profiling a Fluo test running lots of little scans, I noticed a string > builder operation showing up prominently in the profiling results. Below is > a link to the problematic code. Calling range toString was the most > expensive part followed by KeyExtent toString. > [https://github.com/apache/accumulo/blob/rel/1.7.3/core/src/main/java/org/apache/accumulo/core/client/impl/ThriftScanner.java#L405] > > I am not sure if we can change this in 1.7 and 1.8/1.9 because people may > rely on this for debugging. In 2.0 we may want to consider removing this (or > moving it inside the logging code block). > Also, while looking at this I noticed that some of the log statements called > String.format. Those should be placed in an if(log.traceEnabled()) block. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
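The suggestion in ACCUMULO-4789 to build debug strings only inside a level check can be sketched as below. As an assumption made to keep the example self-contained, `java.util.logging` stands in for whatever logging API the scanner code actually uses, and `Expensive` is a hypothetical stand-in for an object such as a range or extent whose `toString()` is costly.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch: avoid building log messages that will never be emitted. */
final class GuardedLoggingSketch {

  /** Stand-in for an object with a costly toString(); counts how often it is called. */
  static final class Expensive {
    int toStringCalls = 0;

    @Override
    public String toString() {
      toStringCalls++;
      return "expensive-description";
    }
  }

  /** Builds the message unconditionally, even when FINEST is disabled. */
  static void scanUnguarded(Logger log, Expensive range) {
    String msg = String.format("scanning %s", range);
    log.finest(msg);
  }

  /** Builds the message only when FINEST is enabled for this logger. */
  static void scanGuarded(Logger log, Expensive range) {
    if (log.isLoggable(Level.FINEST)) {
      log.finest(String.format("scanning %s", range));
    }
  }
}
```

With the logger at `INFO`, `scanGuarded` never calls `toString()`, while `scanUnguarded` pays for the formatting on every scan.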
[jira] [Updated] (ACCUMULO-4809) Session manager clean up can happen when lock held.
[ https://issues.apache.org/jira/browse/ACCUMULO-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4809: - Labels: pull-request-available (was: ) > Session manager clean up can happen when lock held. > --- > > Key: ACCUMULO-4809 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4809 > Project: Accumulo > Issue Type: Bug >Affects Versions: 1.7.3, 1.8.1 >Reporter: Keith Turner >Priority: Critical > Labels: pull-request-available > Fix For: 1.7.4, 1.9.0, 2.0.0 > > > While working on [PR #382|https://github.com/apache/accumulo/pull/382] for > ACCUMULO-4782 I noticed a significant concurrency bug. Before #382 there was > a single lock for the session manager. The session manager will clean up idle > sessions. This clean up should happen outside the session manager lock, > because all tserver read/write operations use the session manager so it should > be responsive. > The bug is the following. > * Both getActiveScansPerTable() and getActiveScans() lock the session > manager and then lock idleSessions. See [SessionManager line > 233|https://github.com/apache/accumulo/blob/rel/1.7.3/server/tserver/src/main/java/org/apache/accumulo/tserver/session/SessionManager.java#L233] > > * The sweep() method locks idleSessions and does cleanup while this lock is > held. [See SessionManager > 200|https://github.com/apache/accumulo/blob/rel/1.7.3/server/tserver/src/main/java/org/apache/accumulo/tserver/session/SessionManager.java#L200] > > Therefore it is possible for getActiveScansPerTable() or getActiveScans() to > lock the session manager and then block trying to lock idleSessions while > cleanup is happening in sweep(). This will block all access to the session > manager while cleanup happens. > The changes in #382 will fix this for 1.9.0 and 2.0.0. However I am not sure > about backporting #382 to 1.7. A more targeted fix could be made for 1.7 or > #382 could be backported. 
> -- This message was sent by Atlassian JIRA (v7.6.3#76005)
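The general remedy described in ACCUMULO-4809, doing slow cleanup outside the lock, can be sketched as follows. This is not the SessionManager code or the change made in #382; `SweepSketch`, `Session`, and `cleanup()` are hypothetical names. The lock is held only long enough to detach idle sessions from the shared map, and the potentially slow cleanup runs afterwards with no lock held.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: collect idle sessions under the lock, clean them up outside it. */
final class SweepSketch {

  static final class Session {
    final long lastAccessMillis;
    boolean cleanedUp = false;

    Session(long lastAccessMillis) {
      this.lastAccessMillis = lastAccessMillis;
    }

    /** Stand-in for releasing scan resources, which may be slow. */
    void cleanup() {
      cleanedUp = true;
    }
  }

  private final Map<Long, Session> sessions = new HashMap<>();

  synchronized void add(long id, Session session) {
    sessions.put(id, session);
  }

  synchronized int size() {
    return sessions.size();
  }

  /** Removes sessions idle for longer than maxIdleMillis; returns how many were swept. */
  int sweep(long nowMillis, long maxIdleMillis) {
    List<Session> expired = new ArrayList<>();
    synchronized (this) {
      // Short critical section: only map bookkeeping happens here.
      Iterator<Session> it = sessions.values().iterator();
      while (it.hasNext()) {
        Session s = it.next();
        if (nowMillis - s.lastAccessMillis > maxIdleMillis) {
          it.remove();
          expired.add(s);
        }
      }
    }
    // Slow work happens after the lock is released, so readers are not blocked.
    for (Session s : expired) {
      s.cleanup();
    }
    return expired.size();
  }
}
```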
[jira] [Updated] (ACCUMULO-4782) With many threads scanning seeing lock contention on SessionManager
[ https://issues.apache.org/jira/browse/ACCUMULO-4782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4782: - Labels: pull-request-available (was: ) > With many threads scanning seeing lock contention on SessionManager > --- > > Key: ACCUMULO-4782 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4782 > Project: Accumulo > Issue Type: Bug >Affects Versions: 1.7.3, 1.8.1 >Reporter: Keith Turner >Priority: Major > Labels: pull-request-available > > While profiling many threads doing small scans against accumulo, lock > contention on the tablet servers SessionManager was high. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4413) Copy/Sort column on WAL recovery table should not exceed 100%
[ https://issues.apache.org/jira/browse/ACCUMULO-4413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4413: - Labels: newbie pull-request-available (was: newbie) > Copy/Sort column on WAL recovery table should not exceed 100% > - > > Key: ACCUMULO-4413 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4413 > Project: Accumulo > Issue Type: Bug >Reporter: Josh Elser >Assignee: Gergely Hajós >Priority: Trivial > Labels: newbie, pull-request-available > > The WAL recovery table on the monitor will often exceed 100% progress. We > should be able to do two things: > 1. Ensure this value does not exceed 100% progress as that is nonsensical > 2. Ensure that we do not report 100% when the copy-sort is still in progress -- This message was sent by Atlassian JIRA (v7.6.3#76005)
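The two requirements in ACCUMULO-4413 (never exceed 100%, and never show 100% while copy/sort is still running) can be sketched as a small clamp. This is an assumption-laden illustration: `RecoveryProgressSketch`, its parameters, and the 0.99 ceiling for in-progress work are all hypothetical choices, not the monitor's actual computation.

```java
/** Hypothetical sketch: clamp displayed recovery progress to a sensible range. */
final class RecoveryProgressSketch {

  /** Highest fraction shown while the copy/sort has not finished (an arbitrary choice). */
  static final double IN_PROGRESS_CEILING = 0.99;

  /**
   * Returns the fraction to display, in [0.0, 1.0].
   * Exactly 1.0 is returned only when the work has actually finished.
   */
  static double displayedFraction(long done, long total, boolean finished) {
    if (finished) {
      return 1.0;
    }
    if (total <= 0) {
      return 0.0; // nothing meaningful to report yet
    }
    double raw = (double) done / total;
    double clamped = Math.max(0.0, Math.min(raw, 1.0));
    return Math.min(clamped, IN_PROGRESS_CEILING);
  }
}
```

Clamping hides the symptom only; it does not explain why the raw value overshoots in the first place.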
[jira] [Updated] (ACCUMULO-4805) Seeing thread contention on FileManager
[ https://issues.apache.org/jira/browse/ACCUMULO-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4805: - Labels: pull-request-available (was: ) > Seeing thread contention on FileManager > --- > > Key: ACCUMULO-4805 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4805 > Project: Accumulo > Issue Type: Bug >Reporter: Keith Turner >Assignee: Keith Turner >Priority: Major > Labels: pull-request-available > > Accumulo had a tablet server wide cache of open files. Accessing this cache > obtains a global lock. In profiling, I am seeing contention on this lock. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4800) Preparse iterator configuration
[ https://issues.apache.org/jira/browse/ACCUMULO-4800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4800: - Labels: pull-request-available (was: ) > Preparse iterator configuration > --- > > Key: ACCUMULO-4800 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4800 > Project: Accumulo > Issue Type: Improvement >Affects Versions: 1.7.3, 1.8.1 >Reporter: Keith Turner >Assignee: Keith Turner >Priority: Major > Labels: pull-request-available > Fix For: 1.9.0, 2.0.0 > > > I am noticing that for small scans a good bit of time is spent parsing > iterator config. It would be nice to pre-parse iterator config and only > reparse when table config changes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4802) Add kill command to accumulo-cluster script
[ https://issues.apache.org/jira/browse/ACCUMULO-4802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4802: - Labels: pull-request-available (was: ) > Add kill command to accumulo-cluster script > --- > > Key: ACCUMULO-4802 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4802 > Project: Accumulo > Issue Type: Improvement > Components: scripts >Reporter: Mike Walch >Assignee: Mike Walch >Priority: Major > Labels: pull-request-available > > It would be helpful to be able to kill the cluster quickly -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4798) Copying Stat in ZooCache is slow
[ https://issues.apache.org/jira/browse/ACCUMULO-4798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4798: - Labels: pull-request-available (was: ) > Copying Stat in ZooCache is slow > > > Key: ACCUMULO-4798 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4798 > Project: Accumulo > Issue Type: Improvement >Affects Versions: 1.7.3, 1.8.1 >Reporter: Keith Turner >Priority: Major > Labels: pull-request-available > Fix For: 1.9.0, 2.0.0 > > > The ZooKeeper cache code caches ZooKeeper stats. When a stat is requested > from the cache it copies it. The ZK stat class offers no good way to copy > other than serialize and deserialize. The code currently does this and it's > slow. All code in Accumulo only uses one field from stat, so it would be > much better to create a simple class that has this one field and can quickly > copy. > > The stat is used very frequently in the metadata cache code to check if a > tserver still holds its lock. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
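A minimal sketch of the "simple class with one field" idea from ACCUMULO-4798. The issue does not say which field of the ZooKeeper stat is used, so the choice of `ephemeralOwner` below is an assumption, as are the class and method names. Copying becomes a single field assignment rather than a serialize and deserialize round trip.

```java
/** Hypothetical sketch: a one-field stand-in for a ZooKeeper stat that is cheap to copy. */
final class SlimStatSketch {

  // Assumed field: the issue only says that a single field of the stat is used.
  private long ephemeralOwner;

  SlimStatSketch() {}

  SlimStatSketch(long ephemeralOwner) {
    this.ephemeralOwner = ephemeralOwner;
  }

  long getEphemeralOwner() {
    return ephemeralOwner;
  }

  /** Overwrites this instance with the other's value, so callers can reuse an object. */
  void set(SlimStatSketch other) {
    this.ephemeralOwner = other.ephemeralOwner;
  }

  /** Returns an independent copy; constant time, no serialization involved. */
  SlimStatSketch copy() {
    return new SlimStatSketch(ephemeralOwner);
  }
}
```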
[jira] [Updated] (ACCUMULO-4797) TableParentConfiguration is unnecessary and cause performance problems
[ https://issues.apache.org/jira/browse/ACCUMULO-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4797: - Labels: pull-request-available (was: ) > TableParentConfiguration is unnecessary and cause performance problems > --- > > Key: ACCUMULO-4797 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4797 > Project: Accumulo > Issue Type: Improvement >Affects Versions: 1.7.3, 1.8.1 >Reporter: Keith Turner >Priority: Major > Labels: pull-request-available > Fix For: 1.9.0, 2.0.0 > > > TableParentConfiguration is a shim class between Table and Namespace > configuration. It exists on the premise that a table may move between > namespaces. However, this is currently not possible in Accumulo and would > require a major redesign to make it possible. This class always spends time > looking up a table's namespace id each time a table property is accessed. > However, this id will never change. > Also, should a table's namespace ever change, the current config code is not > correct because not all code looks up the namespace id. > TableParentConfiguration should be removed. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4709) Add size sanity checks to Mutations
[ https://issues.apache.org/jira/browse/ACCUMULO-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4709: - Labels: newbie pull-request-available (was: newbie) > Add size sanity checks to Mutations > --- > > Key: ACCUMULO-4709 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4709 > Project: Accumulo > Issue Type: Improvement >Reporter: Keith Turner >Assignee: Gergely Hajós >Priority: Major > Labels: newbie, pull-request-available > > Based on ACCUMULO-4708, it may be good to add size sanity checks to > Accumulo's Mutation data type. The first step would be to determine how > Mutation handles the following situations currently. > * Create a mutation and put lots of small entries where total size exceeds > 2GB > * Create a mutation and add a single entry where the total of all fields > exceeds 2GB, but no individual field exceeds 2GB -- This message was sent by Atlassian JIRA (v7.6.3#76005)
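One possible shape for the size check discussed in ACCUMULO-4709, sketched under assumptions: the class below tracks only byte counts and is not Accumulo's Mutation, and the limit of `Integer.MAX_VALUE` bytes is an assumed ceiling based on the 2GB figure in the issue. It covers both listed situations, since the per-entry sum and the running total are each checked with overflow-safe arithmetic.

```java
/** Hypothetical sketch: reject entries that would push a buffer past an int-sized limit. */
final class SizeCheckedBufferSketch {

  /** Assumed ceiling: the largest size addressable with an int length. */
  static final long MAX_BYTES = Integer.MAX_VALUE;

  private long totalBytes = 0;

  long totalBytes() {
    return totalBytes;
  }

  /** Records one entry given the byte lengths of its fields (row, family, value, ...). */
  void put(long... fieldLengths) {
    long entryBytes = 0;
    for (long len : fieldLengths) {
      if (len < 0) {
        throw new IllegalArgumentException("negative field length: " + len);
      }
      // addExact throws ArithmeticException instead of silently wrapping around.
      entryBytes = Math.addExact(entryBytes, len);
    }
    long newTotal = Math.addExact(totalBytes, entryBytes);
    if (newTotal > MAX_BYTES) {
      throw new IllegalArgumentException(
          "entry would grow buffer to " + newTotal + " bytes, limit is " + MAX_BYTES);
    }
    totalBytes = newTotal;
  }
}
```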
[jira] [Updated] (ACCUMULO-4792) Improve crypto configuration checks
[ https://issues.apache.org/jira/browse/ACCUMULO-4792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4792: - Labels: pull-request-available (was: ) > Improve crypto configuration checks > --- > > Key: ACCUMULO-4792 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4792 > Project: Accumulo > Issue Type: Improvement >Reporter: Nick Felts >Assignee: Nick Felts >Priority: Minor > Labels: pull-request-available > > The crypto module class and secret key encryption strategy class should each > be checked to ensure that if one is set not-null, the other is also enabled. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
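The both-or-neither check described in ACCUMULO-4792 can be sketched as below. The names are hypothetical, and treating a null or blank class name as "not set" is an assumption; the real configuration may use a different sentinel for a disabled setting.

```java
/** Hypothetical sketch: two related settings must be enabled together or not at all. */
final class CryptoConfigCheckSketch {

  /** Assumption: a null or blank class name means the setting is disabled. */
  static boolean isSet(String className) {
    return className != null && !className.trim().isEmpty();
  }

  /** Throws if exactly one of the two settings is enabled. */
  static void validate(String cryptoModuleClass, String secretKeyStrategyClass) {
    if (isSet(cryptoModuleClass) != isSet(secretKeyStrategyClass)) {
      throw new IllegalArgumentException(
          "crypto module and secret key encryption strategy must both be set or both be unset");
    }
  }
}
```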
[jira] [Updated] (ACCUMULO-4772) Update Accumulo shell to utilize new NewTableConfiguration methods
[ https://issues.apache.org/jira/browse/ACCUMULO-4772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4772: - Labels: pull-request-available (was: ) > Update Accumulo shell to utilize new NewTableConfiguration methods > -- > > Key: ACCUMULO-4772 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4772 > Project: Accumulo > Issue Type: Improvement > Components: shell >Reporter: Mark Owens >Assignee: Mark Owens >Priority: Minor > Labels: pull-request-available > Fix For: 2.0.0 > > > ACCUMULO-4732 adds the capability for NewTableConfiguration to preconfigure > iterators and locality groups prior to table creation. Update the Accumulo > shell to allow the same capability. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4787) Numerous leaked CLOSE_WAIT threads from TabletServer
[ https://issues.apache.org/jira/browse/ACCUMULO-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4787: - Labels: pull-request-available (was: ) > Numerous leaked CLOSE_WAIT threads from TabletServer > > > Key: ACCUMULO-4787 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4787 > Project: Accumulo > Issue Type: Bug >Affects Versions: 1.8.1 > Environment: * Ubuntu 14.04 > * HDFS 2.6.0 and 2.5.0 (in the middle of an upgrade cycle) > * ZooKeeper 3.4.6 > * Accumulo 1.8.1 > * HotSpot 1.8.0_121 >Reporter: Adam J Shook >Assignee: Adam J Shook >Priority: Minor > Labels: pull-request-available > > I'm running into an issue across all environments where TabletServers are > occupying a large number of ports in a CLOSE_WAIT state writing to a > DataNode at port 50010. I'm seeing numbers from around 12,000 to 20,000 > ports. In some instances, there were over 68k and it was preventing other > applications from getting a free port, so they would fail to start (which is > how we found this in the first place). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4778) Resolving table name to table id is expensive
[ https://issues.apache.org/jira/browse/ACCUMULO-4778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4778: - Labels: pull-request-available (was: ) > Resolving table name to table id is expensive > - > > Key: ACCUMULO-4778 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4778 > Project: Accumulo > Issue Type: Bug >Affects Versions: 1.7.3, 1.8.1 >Reporter: Keith Turner >Assignee: Michael Miller >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > > I was running a Fluo test application and profiling the tablet server and > Fluo worker. The Fluo worker does lots of small scans against Accumulo. > Resolving table names to ids (which is done for each scan) was expensive > enough to make a significant showing in the profiling data. > I looked at the 1.8 code and it does the following to resolve a table name : > * reads over all cached table ids in zookeeper putting them in a treemap > * does a lookup in the treemap > Ideally the client code would keep a cache of name to id mappings and > invalidate them when something changes in zookeeper. The data in zookeeper > is stored by id, so it does need to be inverted to lookup by name. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
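The cached, inverted name-to-id mapping suggested in ACCUMULO-4778 can be sketched as follows. Everything here is hypothetical: `TableNameCacheSketch`, the `Supplier` that stands in for reading the id-to-name data from ZooKeeper, and the `invalidate()` hook that a change notification would call. The inversion runs once per invalidation instead of once per scan.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical sketch: cache the inverted id-to-name mapping until it is invalidated. */
final class TableNameCacheSketch {

  // Stand-in for reading the id -> name entries from the backing store.
  private final Supplier<Map<String, String>> idToNameLoader;

  // null means the cache has been invalidated and must be rebuilt on next use.
  private volatile Map<String, String> nameToId;

  private int loads = 0;

  TableNameCacheSketch(Supplier<Map<String, String>> idToNameLoader) {
    this.idToNameLoader = idToNameLoader;
  }

  /** Returns the table id for the name, or null if the name is unknown. */
  String getTableId(String tableName) {
    Map<String, String> current = nameToId;
    if (current == null) {
      current = rebuild();
    }
    return current.get(tableName);
  }

  /** Called when the backing data changes, for example from a watcher callback. */
  void invalidate() {
    nameToId = null;
  }

  synchronized int loads() {
    return loads;
  }

  private synchronized Map<String, String> rebuild() {
    Map<String, String> current = nameToId;
    if (current == null) {
      // The store is keyed by id, so invert it once to allow lookups by name.
      Map<String, String> inverted = new HashMap<>();
      for (Map.Entry<String, String> e : idToNameLoader.get().entrySet()) {
        inverted.put(e.getValue(), e.getKey());
      }
      loads++;
      current = inverted;
      nameToId = current;
    }
    return current;
  }
}
```

The sketch serves stale data until `invalidate()` is called, so its correctness depends on change notifications arriving reliably.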
[jira] [Updated] (ACCUMULO-3361) CryptoTest creates files in /tmp/tmp/
[ https://issues.apache.org/jira/browse/ACCUMULO-3361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-3361: - Labels: pull-request-available (was: ) > CryptoTest creates files in /tmp/tmp/ > - > > Key: ACCUMULO-3361 > URL: https://issues.apache.org/jira/browse/ACCUMULO-3361 > Project: Accumulo > Issue Type: Bug >Reporter: Christopher Tubbs >Assignee: Mark Owens >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > Time Spent: 10m > Remaining Estimate: 0h > > The CryptoTest creates files {{/tmp/tmp/test.secret.key}} and > {{/tmp/tmp/.test.secret.key.crc}}. It should be creating these files in > {{target/}} instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4779) Instantiating iterator accesses config in a very slow way
[ https://issues.apache.org/jira/browse/ACCUMULO-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4779: - Labels: pull-request-available (was: ) > Instantiating iterator accesses config in a very slow way > - > > Key: ACCUMULO-4779 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4779 > Project: Accumulo > Issue Type: Bug >Affects Versions: 1.7.3, 1.8.1 >Reporter: Keith Turner >Assignee: Keith Turner >Priority: Major > Labels: pull-request-available > Fix For: 1.7.4, 1.9.0, 2.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > I noticed this while looking at profiling data. When creating iterators, all > configuration is put in a TreeMap just to get two keys out. > The following code > https://github.com/apache/accumulo/blob/rel/1.8.1/start/src/main/java/org/apache/accumulo/start/classloader/vfs/ContextManager.java#L118 > calls the following code > https://github.com/apache/accumulo/blob/rel/1.8.1/core/src/main/java/org/apache/accumulo/core/conf/AccumuloConfiguration.java#L153 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4781) Per scan logging is expensive
[ https://issues.apache.org/jira/browse/ACCUMULO-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4781: - Labels: pull-request-available (was: ) > Per scan logging is expensive > - > > Key: ACCUMULO-4781 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4781 > Project: Accumulo > Issue Type: Bug >Affects Versions: 1.7.3, 1.8.1 >Reporter: Keith Turner >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > While profiling Accumulo in a situation where many threads were doing small > scans, it was noticed that per-scan logging was expensive. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4784) Create builder methods for Connector to simplify client API
[ https://issues.apache.org/jira/browse/ACCUMULO-4784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4784: - Labels: pull-request-available (was: ) > Create builder methods for Connector to simplify client API > --- > > Key: ACCUMULO-4784 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4784 > Project: Accumulo > Issue Type: Improvement > Components: client >Affects Versions: 2.0.0 > Environment: Currently, Connector objects are created using > ZookeeperInstance. Client code would be cleaner if it was created using > builder methods in Connector. >Reporter: Mike Walch >Assignee: Mike Walch >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ACCUMULO-4780) Add overflow check to sequence number in CommitSession
[ https://issues.apache.org/jira/browse/ACCUMULO-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4780: - Labels: pull-request-available (was: ) > Add overflow check to sequence number in CommitSession > -- > > Key: ACCUMULO-4780 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4780 > Project: Accumulo > Issue Type: Bug >Affects Versions: 1.7.3, 1.8.1 >Reporter: Keith Turner >Assignee: Mark Owens >Priority: Major > Labels: pull-request-available > > CommitSession has an integer sequence number that could possibly overflow if > a tablet does 1 billion minor compactions on the same tablet server. It > would be nice to either change this to a long or check if the integer has > overflowed after incrementing. This problem was identified while looking into > ACCUMULO-4777; there are some comments there with background information. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
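One way to implement the "check after incrementing" option from ACCUMULO-4780 is sketched below. SequenceCounter is a hypothetical stand-in for the counter in CommitSession, not the real class; the sketch assumes that failing loudly is preferable to silently wrapping to a negative value.

```java
/** Hypothetical stand-in for the counter described in the ticket; not the real CommitSession. */
final class SequenceCounter {
  private int seq;

  SequenceCounter(int start) {
    this.seq = start;
  }

  /** Increments the sequence number, failing loudly instead of wrapping around. */
  int next() {
    // Math.incrementExact throws ArithmeticException when the int would overflow.
    seq = Math.incrementExact(seq);
    return seq;
  }
}
```

Switching the field to a long, the other option named in the ticket, avoids the exception path entirely but changes the type seen by any code that consumes the sequence number.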
[jira] [Updated] (ACCUMULO-4777) Root tablet got spammed with 1.8 million log entries
[ https://issues.apache.org/jira/browse/ACCUMULO-4777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4777: - Labels: pull-request-available (was: ) > Root tablet got spammed with 1.8 million log entries > > > Key: ACCUMULO-4777 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4777 > Project: Accumulo > Issue Type: Bug >Affects Versions: 1.8.1 >Reporter: Ivan Bella >Priority: Critical > Labels: pull-request-available > Fix For: 1.8.2, 2.0.0 > > > We had a tserver that was handling accumulo.metadata tablets that somehow got > into a loop where it created over 22K empty wal logs. There were around 70 > metadata tablets and this resulted in around 1.8 million log entries being > added to the accumulo.root table. The only reason it stopped creating wal logs is > because it ran out of open file handles. This took us many hours and cups of > coffee to clean up. > The log contained the following messages in a tight loop: > log.TabletServerLogger INFO : Using next log hdfs://... > tserver.TabletServer INFO : Writing log marker for hdfs://... > tserver.TabletServer INFO : Marking hdfs://... closed > log.DfsLogger INFO : Slow sync cost ... > ... > Unfortunately we did not have DEBUG turned on so we have no debug messages. > Tracking through the code there are three places where the > TabletServerLogger.close method is called: > 1) via resetLoggers in the TabletServerLogger, but nothing calls this method > so this is ruled out > 2) when the log gets too large or too old, but neither of those checks should > have been hitting here. > 3) In a loop that is executed (while (!success)) in the > TabletServerLogger.write method. In this case, when we fail to write > something to the wal, that one is closed and a new one is created. This > loop will go forever until we successfully write out the entry. A > DfsLogger.LogClosedException seems the most logical reason. This is most > likely because a ClosedChannelException was thrown from the DfsLogger.write > methods (around line 609 in DfsLogger). > So the root cause was most likely hadoop related. However, in accumulo we > probably should not be doing a tight retry loop around a hadoop failure. I > recommend at a minimum doing some sort of exponential backoff and perhaps > setting a limit on the number of retries, resulting in a critical tserver > failure. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
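The recommendation in ACCUMULO-4777 (exponential backoff plus a cap on retries) can be sketched generically as follows. This is not the TabletServerLogger code; BackoffRetry, its method names, and its parameters are hypothetical, and the Callable stands in for the WAL write that was being retried in a tight loop.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper: bounded retries with exponential backoff between attempts. */
final class BackoffRetry {

  /** Delay before retry number `attempt` (0-based): doubles each time, capped at maxDelayMs. */
  static long delayMs(int attempt, long initialDelayMs, long maxDelayMs) {
    long delay = initialDelayMs;
    for (int i = 0; i < attempt && delay < maxDelayMs; i++) {
      delay *= 2;
    }
    return Math.min(delay, maxDelayMs);
  }

  /** Runs op until it succeeds or maxAttempts is reached, sleeping between failed attempts. */
  static <T> T call(Callable<T> op, int maxAttempts, long initialDelayMs, long maxDelayMs)
      throws Exception {
    Exception last = null;
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        return op.call();
      } catch (Exception e) {
        last = e;
        if (attempt + 1 < maxAttempts) {
          Thread.sleep(delayMs(attempt, initialDelayMs, maxDelayMs));
        }
      }
    }
    // The caller decides whether exhausting the retries is fatal for the process.
    throw new IllegalStateException("giving up after " + maxAttempts + " attempts", last);
  }
}
```

With an initial delay of 100 ms and a cap of 1000 ms, the waits would be 100, 200, 400, 800, 1000, 1000, ... ms, which bounds how quickly new logs can be created while the underlying failure persists.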
[jira] [Updated] (ACCUMULO-4776) Monitor advertises wrong address in ZooKeeper when binding to all interfaces (0.0.0.0)
[ https://issues.apache.org/jira/browse/ACCUMULO-4776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4776: - Labels: pull-request-available (was: ) > Monitor advertises wrong address in ZooKeeper when binding to all interfaces > (0.0.0.0) > -- > > Key: ACCUMULO-4776 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4776 > Project: Accumulo > Issue Type: Bug >Affects Versions: 1.6.6, 1.7.2, 1.8.0 >Reporter: John Vines >Assignee: Christopher Tubbs > Labels: pull-request-available > Fix For: 1.7.4, 1.8.2 > > > ACCUMULO-2065 fixed this by explicitly using the local host name as the > hostname to advertise. This was rolled back in ACCUMULO-4032 because the 2065 > change broke the ability to specify which name/IP to bind to. This > means that log forwarding does not work in clusters using > ACCUMULO_MONITOR_BIND_ALL (originally introduced under ACCUMULO-1985). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4770) Accumulo monitor overview page is not listing all Zookeeper nodes
[ https://issues.apache.org/jira/browse/ACCUMULO-4770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4770: - Labels: pull-request-available (was: ) > Accumulo monitor overview page is not listing all Zookeeper nodes > - > > Key: ACCUMULO-4770 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4770 > Project: Accumulo > Issue Type: Bug > Components: monitor >Affects Versions: 2.0.0 > Environment: Ran Accumulo 2.0.0-SNAPSHOT on a cluster with 3 > Zookeeper nodes. Only one is being listed in the Accumulo monitor overview > page. >Reporter: Mike Walch >Assignee: Christopher Tubbs >Priority: Blocker > Labels: pull-request-available > Fix For: 2.0.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4771) Replace tables in Monitor with DataTables
[ https://issues.apache.org/jira/browse/ACCUMULO-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4771: - Labels: pull-request-available (was: ) > Replace tables in Monitor with DataTables > - > > Key: ACCUMULO-4771 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4771 > Project: Accumulo > Issue Type: Improvement > Components: monitor >Affects Versions: 2.0.0 >Reporter: Michael Miller >Assignee: Michael Miller > Labels: pull-request-available > Fix For: 2.0.0 > > > I found this Javascript library: https://datatables.net/ > I think this would give us everything we need to display data in the Monitor. > It would take some work to get it working but it would eliminate a lot of > custom code and give us more features. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-1975) Refactor AccumuloConfiguration.instantiateClassProperty and Property.createInstanceFromPropertyName
[ https://issues.apache.org/jira/browse/ACCUMULO-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-1975: - Labels: newbie pull-request-available (was: newbie) > Refactor AccumuloConfiguration.instantiateClassProperty and > Property.createInstanceFromPropertyName > --- > > Key: ACCUMULO-1975 > URL: https://issues.apache.org/jira/browse/ACCUMULO-1975 > Project: Accumulo > Issue Type: Improvement >Reporter: Bill Havanki >Assignee: Al Krinker >Priority: Minor > Labels: newbie, pull-request-available > Fix For: 2.0.0 > > > ... because they do the same thing. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4528) importtable and exporttable deserve descriptions in the user manual
[ https://issues.apache.org/jira/browse/ACCUMULO-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4528: - Labels: newbie pull-request-available (was: newbie) > importtable and exporttable deserve descriptions in the user manual > --- > > Key: ACCUMULO-4528 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4528 > Project: Accumulo > Issue Type: Task > Components: docs >Reporter: Josh Elser >Assignee: Mark Owens > Labels: newbie, pull-request-available > Fix For: 2.0.0 > > Time Spent: 4.5h > Remaining Estimate: 0h > > The user manual has a section on exporttable in > http://accumulo.apache.org/1.7/accumulo_user_manual.html#_exporting_tables > but this is just a pointer out to a readme file. > We should really make this a proper chapter to avoid making users have two > hops to get to the documentation. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4774) Conditional Writer creates a non daemon thread that is keeping processes alive
[ https://issues.apache.org/jira/browse/ACCUMULO-4774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4774: - Labels: pull-request-available (was: ) > Conditional Writer creates a non daemon thread that is keeping processes alive > -- > > Key: ACCUMULO-4774 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4774 > Project: Accumulo > Issue Type: Bug >Affects Versions: 1.7.3, 1.8.1 >Reporter: Keith Turner > Labels: pull-request-available > Fix For: 1.7.4, 1.8.2, 2.0.0 > > > The Conditional writer has a static thread pool that creates a non-daemon > thread that is keeping processes alive. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
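For ACCUMULO-4774, the usual remedy for a pool that keeps the JVM alive is to give it a ThreadFactory that marks its threads as daemon threads. The sketch below shows that general technique; DaemonThreads and its methods are hypothetical names, and the actual pool type used by the conditional writer is not assumed here.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper that builds thread pools whose threads do not block JVM exit. */
final class DaemonThreads {

  /** Returns a factory producing named daemon threads: prefix-1, prefix-2, ... */
  static ThreadFactory factory(String prefix) {
    AtomicInteger counter = new AtomicInteger();
    return runnable -> {
      Thread t = new Thread(runnable, prefix + "-" + counter.incrementAndGet());
      t.setDaemon(true); // daemon threads do not keep the process alive
      return t;
    };
  }

  /** Fixed-size pool backed by daemon threads. */
  static ExecutorService newPool(int threads, String prefix) {
    return Executors.newFixedThreadPool(threads, factory(prefix));
  }
}
```

Because daemon threads are abandoned at JVM exit, this is only appropriate for work that is safe to drop when the process shuts down.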
[jira] [Updated] (ACCUMULO-4746) Create Builder for Mutation
[ https://issues.apache.org/jira/browse/ACCUMULO-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4746: - Labels: pull-request-available (was: ) > Create Builder for Mutation > --- > > Key: ACCUMULO-4746 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4746 > Project: Accumulo > Issue Type: Sub-task > Components: client >Reporter: Keith Turner >Assignee: Benjamin F > Labels: pull-request-available > Fix For: 2.0.0 > > > Accumulo needs a builder for mutation similar to the one that was added for > Key. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4769) Sanity check for valid CryptoModule and KeyEncryptionStrategy in config
[ https://issues.apache.org/jira/browse/ACCUMULO-4769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4769: - Labels: pull-request-available (was: ) > Sanity check for valid CryptoModule and KeyEncryptionStrategy in config > --- > > Key: ACCUMULO-4769 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4769 > Project: Accumulo > Issue Type: Improvement >Reporter: Nick Felts >Assignee: Nick Felts > Labels: pull-request-available > > The current code will log a warning and default to an unencrypted mode if an > unknown class is provided for the cryptomodule or for the > keyencryptionstrategy. These should be checked when they are read from the > config to confirm they exist and implement the appropriate interface. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
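The check requested in ACCUMULO-4769 (confirm the configured class exists and implements the expected interface) might be sketched like this. ClassPropertyValidator and requireImplementation are hypothetical names, and the sketch assumes that rejecting the configuration with an exception is the desired alternative to silently falling back to an unencrypted mode.

```java
/** Hypothetical validator for class-name properties read from configuration. */
final class ClassPropertyValidator {

  /**
   * Loads className and verifies it is assignable to the required type.
   * Throws IllegalArgumentException if the class is missing or of the wrong type.
   */
  static <T> Class<? extends T> requireImplementation(String className, Class<T> required) {
    final Class<?> loaded;
    try {
      loaded = Class.forName(className);
    } catch (ClassNotFoundException e) {
      throw new IllegalArgumentException("Configured class not found: " + className, e);
    }
    if (!required.isAssignableFrom(loaded)) {
      throw new IllegalArgumentException(
          className + " does not implement " + required.getName());
    }
    return loaded.asSubclass(required);
  }
}
```

Running a check like this when the property is read surfaces a typo in the class name at startup instead of at first use.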
[jira] [Updated] (ACCUMULO-4767) Remove duplicate code in DefaultCryptoModule
[ https://issues.apache.org/jira/browse/ACCUMULO-4767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4767: - Labels: pull-request-available (was: ) > Remove duplicate code in DefaultCryptoModule > > > Key: ACCUMULO-4767 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4767 > Project: Accumulo > Issue Type: Improvement >Reporter: Nick Felts >Assignee: Nick Felts >Priority: Trivial > Labels: pull-request-available > > Removing duplicate code for creating a key > Adding a check to BlockedOutputStream to make sure an array exists before > using it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4764) Move static html from JS to templates
[ https://issues.apache.org/jira/browse/ACCUMULO-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4764: - Labels: pull-request-available (was: ) > Move static html from JS to templates > - > > Key: ACCUMULO-4764 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4764 > Project: Accumulo > Issue Type: Improvement > Components: monitor >Affects Versions: 2.0.0 >Reporter: Michael Miller >Assignee: Michael Miller > Labels: pull-request-available > Fix For: 2.0.0 > > > The new Monitor has too much code embedded in the Javascript. A lot of it is > just static html code that should be created in the freemarker templates > before the JS is loaded. For example, all of the table column descriptions > used for tool tips are loaded into [one massive JS > Array|https://github.com/apache/accumulo/blob/master/server/monitor/src/main/resources/org/apache/accumulo/monitor/resources/js/global.js]. > Moving the static code to the templates will help greatly with maintenance, > specifically keeping external Javascript libraries up to date. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4749) Need a bulk loading test equivalent to continuous ingest
[ https://issues.apache.org/jira/browse/ACCUMULO-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4749: - Labels: pull-request-available (was: ) > Need a bulk loading test equivalent to continuous ingest > > > Key: ACCUMULO-4749 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4749 > Project: Accumulo > Issue Type: Improvement > Components: test >Reporter: Ivan Bella >Assignee: Jared R > Labels: pull-request-available > > There are some known cases at least in past versions where bulk loading may > fail leaving the ~blip in place but no transaction left to handle it. This > will result in directories of files being left around that are not loaded. > We should create a continuous ingest variant that uses bulk loading instead. > Then if this is run with agitation, the continuous ingest verification can > find data that has been essentially orphaned. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-3902) Ensure [Batch]Scanners are closed in ITs
[ https://issues.apache.org/jira/browse/ACCUMULO-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-3902: - Labels: newbie pull-request-available (was: newbie) > Ensure [Batch]Scanners are closed in ITs > > > Key: ACCUMULO-3902 > URL: https://issues.apache.org/jira/browse/ACCUMULO-3902 > Project: Accumulo > Issue Type: Bug > Components: test >Reporter: Josh Elser >Assignee: Jared R >Priority: Trivial > Labels: newbie, pull-request-available > Fix For: 1.7.4, 1.8.2, 2.0.0 > > > Do an audit of the integration tests to verify that we actually close > Scanners and BatchScanners. This is a best practice that we should be > encouraging (by doing it in our tests). Leaving them open can also lead to bugs in other > test cases (ACCUMULO-3888). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4751) Some WALs don't replicate due to lacking a createdTime entry
[ https://issues.apache.org/jira/browse/ACCUMULO-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4751: - Labels: pull-request-available (was: ) > Some WALs don't replicate due to lacking a createdTime entry > > > Key: ACCUMULO-4751 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4751 > Project: Accumulo > Issue Type: Bug >Affects Versions: 1.7.3, 1.8.1 >Reporter: Adam J Shook >Assignee: Adam J Shook > Labels: pull-request-available > Attachments: repl_logs.txt > > > From what I can tell, the below error is thrown when no data for a particular > table is written to a WAL, but the file is closed. This would be because the > {{Status}} entry from the {{StatusUtil}} for {{fileClosed}} is pre-built and > therefore does not have a {{createdTime}}. This prevents a WAL from being > replicated until a {{createdTime}} entry is added manually. > From the Accumulo master: > {code} > Status record ([begin: 0 end: 0 infiniteEnd: true closed: true]) for > hdfs://namenode:9000/accumulo/wal/tserver.example.com+31732/f922df9c-3ffc-49ee-8d0c-261c7a05fea2 > in table 7l was written to metadata table which lacked createdTime > {code} > There are two solutions I have in mind: > 1. Update the {{StatusUtil}} such that every returned {{Status}} object sets > the {{createdTime}} to {{System.currentTimeMillis}} if not explicitly given. > 2. Update the Accumulo Master to set the {{createdTime}} to the WAL's > modification time in HDFS if the WAL is closed but there is no > {{createdTime}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4763) Avoid use of ambiguous term 'file' in property descriptions
[ https://issues.apache.org/jira/browse/ACCUMULO-4763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4763: - Labels: pull-request-available (was: ) > Avoid use of ambiguous term 'file' in property descriptions > --- > > Key: ACCUMULO-4763 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4763 > Project: Accumulo > Issue Type: Task > Components: core >Affects Versions: 2.0.0 > Environment: In the configuration property descriptions, 'file' is > used in place of 'RFile' or 'write-ahead log'. Properties would be easier to > understand if the descriptions were more explicit. >Reporter: Mike Walch >Assignee: Mike Walch >Priority: Minor > Labels: pull-request-available > Fix For: 2.0.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4732) No APIs to configure iterators and locality groups for new table
[ https://issues.apache.org/jira/browse/ACCUMULO-4732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4732: - Labels: newbie pull-request-available (was: newbie) > No APIs to configure iterators and locality groups for new table > > > Key: ACCUMULO-4732 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4732 > Project: Accumulo > Issue Type: Improvement >Reporter: Keith Turner >Assignee: Mark Owens > Labels: newbie, pull-request-available > Fix For: 2.0.0 > > > In Accumulo 1.7 the ability to set table properties at table creation time > was added. For existing tables there are APIs in table operations that allow > setting locality groups and iterators. When setting > table properties at table creation time there is no good API for iterators > and locality groups. There should be some way in the API to do this. There > may be other things besides iterators and locality groups that should also be > supported at table creation time. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4761) Add documentation on canceling compaction
[ https://issues.apache.org/jira/browse/ACCUMULO-4761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4761: - Labels: pull-request-available (was: ) > Add documentation on canceling compaction > - > > Key: ACCUMULO-4761 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4761 > Project: Accumulo > Issue Type: Task > Components: docs >Affects Versions: 2.0.0 >Reporter: Mike Walch >Assignee: Mike Walch > Labels: pull-request-available > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4755) Register a custom serializer for AbstractId types (Table.ID and Namespace.ID)
[ https://issues.apache.org/jira/browse/ACCUMULO-4755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4755: - Labels: pull-request-available (was: ) > Register a custom serializer for AbstractId types (Table.ID and Namespace.ID) > - > > Key: ACCUMULO-4755 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4755 > Project: Accumulo > Issue Type: Task > Components: monitor >Reporter: Christopher Tubbs >Assignee: Benjamin F >Priority: Minor > Labels: pull-request-available > Fix For: 2.0.0 > > > As quick follow-on to ACCUMULO-4745, I think it might be worth taking a look > at registering a custom serializer for AbstractId types, so we can keep a > reference to the original Table.ID objects in the REST POJOs, and still have > the serialization do the right thing. > I found this wiki, which may be helpful. > https://github.com/FasterXML/jackson-docs/wiki/JacksonHowToCustomSerializers -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4758) Summaries command fails on ShellServerIT
[ https://issues.apache.org/jira/browse/ACCUMULO-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4758: - Labels: pull-request-available (was: ) > Summaries command fails on ShellServerIT > > > Key: ACCUMULO-4758 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4758 > Project: Accumulo > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Michael Miller > Labels: pull-request-available > > The ShellServerIT.testSummarySelection has been failing recently. I believe > its due to a recent change from ACCUMULO-4641. It appears to be a thread > concurrency issue in the CachableBlockFile. Here is the stack trace: > {code:java} > 2017-12-07 18:30:47,617 [thrift.ProcessFunction] ERROR: Internal error > processing startGetSummariesFromFiles > org.apache.thrift.TException: java.lang.RuntimeException: > java.util.concurrent.ExecutionException: java.io.UncheckedIOException: > org.apache.accumulo.core.file.rfile.bcfile.MetaBlockDoesNotExist: > name=accumulo.summaries.index > at > org.apache.accumulo.server.rpc.RpcWrapper$1.invoke(RpcWrapper.java:63) > at com.sun.proxy.$Proxy21.startGetSummariesFromFiles(Unknown Source) > at > org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$startGetSummariesFromFiles.getResult(TabletClientService.java:3335) > at > org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$startGetSummariesFromFiles.getResult(TabletClientService.java:3319) > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > at > org.apache.accumulo.server.rpc.TimedProcessor.process(TimedProcessor.java:63) > at > org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:518) > at > org.apache.accumulo.server.rpc.CustomNonBlockingServer$CustomFrameBuffer.invoke(CustomNonBlockingServer.java:106) > at 
org.apache.thrift.server.Invocation.run(Invocation.java:18) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.RuntimeException: > java.util.concurrent.ExecutionException: java.io.UncheckedIOException: > org.apache.accumulo.core.file.rfile.bcfile.MetaBlockDoesNotExist: > name=accumulo.summaries.index > at > org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.getSummaries(TabletServer.java:1822) > at > org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.startSummaryOperation(TabletServer.java:1834) > at > org.apache.accumulo.tserver.TabletServer$ThriftClientHandler.startGetSummariesFromFiles(TabletServer.java:1898) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.accumulo.core.trace.wrappers.RpcServerInvocationHandler.invoke(RpcServerInvocationHandler.java:46) > at > org.apache.accumulo.server.rpc.RpcWrapper$1.invoke(RpcWrapper.java:60) > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4742) Put Monitor resources in fully qualified package
[ https://issues.apache.org/jira/browse/ACCUMULO-4742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4742: - Labels: pull-request-available (was: ) > Put Monitor resources in fully qualified package > > > Key: ACCUMULO-4742 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4742 > Project: Accumulo > Issue Type: Improvement > Components: monitor >Affects Versions: 2.0.0 >Reporter: Michael Miller >Assignee: Christopher Tubbs >Priority: Minor > Labels: pull-request-available > Fix For: 2.0.0 > > > The resources in the Monitor should be in a fully qualified package... i.e. > org/apache/accumulo/monitor/ > This will prevent problems from occurring when deployed on a cluster. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4756) New Monitor Validation fails for default tables
[ https://issues.apache.org/jira/browse/ACCUMULO-4756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4756: - Labels: pull-request-available (was: ) > New Monitor Validation fails for default tables > --- > > Key: ACCUMULO-4756 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4756 > Project: Accumulo > Issue Type: Bug > Components: monitor >Affects Versions: 2.0.0 >Reporter: Michael Miller >Assignee: Michael Miller > Labels: pull-request-available > Fix For: 2.0.0 > > > With the new validation rules introduced in ACCUMULO-4677, the getTables > validation will fail on all of the default accumulo table IDs. The > ALPHA_NUM_REGEX doesn't like "+" or the "!0" ID. > I am not sure what the best way to fix this would be. We could either add a > more complicated regex that will allow the special IDs or change the REST > interface to use table names instead of IDs. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
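The first option raised in ACCUMULO-4756, a regex that also accepts the special IDs, could be sketched as below. The pattern is an assumption made for illustration: it is not the project's ALPHA_NUM_REGEX and it simply allows one optional leading "!" or "+" before an alphanumeric ID, which covers the "!0" and "+"-prefixed IDs mentioned in the ticket.

```java
import java.util.regex.Pattern;

/** Hypothetical validation helper; the pattern is illustrative, not the project's own. */
final class TableIdValidation {
  // Alphanumeric ID, optionally prefixed by a single '!' or '+' (assumption).
  static final Pattern TABLE_ID = Pattern.compile("[!+]?[a-zA-Z0-9]+");

  /** Returns true only if the whole string matches the table ID pattern. */
  static boolean isValid(String id) {
    return id != null && TABLE_ID.matcher(id).matches();
  }
}
```

A stricter alternative would enumerate the known system table IDs explicitly, at the cost of updating the list whenever a new system table is introduced.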
[jira] [Updated] (ACCUMULO-4754) Fix links to properties in 2.0 documentation
[ https://issues.apache.org/jira/browse/ACCUMULO-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4754: - Labels: pull-request-available (was: ) > Fix links to properties in 2.0 documentation > > > Key: ACCUMULO-4754 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4754 > Project: Accumulo > Issue Type: Bug > Components: website >Affects Versions: 2.0.0 >Reporter: Mike Walch >Assignee: Mike Walch > Labels: pull-request-available > Fix For: 2.0.0 > > > If you click on a link for a configuration property in 2.0, it will go to > the property, but the property will be hidden below the navbar. This can be fixed by > adding hidden whitespace. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4752) Create troubleshooting page on improving performance
[ https://issues.apache.org/jira/browse/ACCUMULO-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4752: - Labels: pull-request-available (was: ) > Create troubleshooting page on improving performance > > > Key: ACCUMULO-4752 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4752 > Project: Accumulo > Issue Type: Task >Affects Versions: 2.0.0 >Reporter: Mike Walch >Assignee: Mike Walch > Labels: pull-request-available > Fix For: 2.0.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4747) Create a unified upgrade reference
[ https://issues.apache.org/jira/browse/ACCUMULO-4747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4747: - Labels: documentation pull-request-available (was: documentation) > Create a unified upgrade reference > -- > > Key: ACCUMULO-4747 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4747 > Project: Accumulo > Issue Type: Improvement > Components: docs >Reporter: Mark Owens >Assignee: Mark Owens >Priority: Minor > Labels: documentation, pull-request-available > > It could be helpful to have a page which provides upgrade instructions for > the various releases collected into one location. This page could then be > updated as needed as new versions are released. Prompted by someone needing > to upgrade from 1.4 to 1.8 and looking for a good location to find what steps > would be needed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4750) Improve block cache documentation
[ https://issues.apache.org/jira/browse/ACCUMULO-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4750: - Labels: pull-request-available (was: ) > Improve block cache documentation > - > > Key: ACCUMULO-4750 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4750 > Project: Accumulo > Issue Type: Task >Reporter: Mike Walch >Assignee: Mike Walch >Priority: Minor > Labels: pull-request-available > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-3970) Generating multiple views of a value at scan time
[ https://issues.apache.org/jira/browse/ACCUMULO-3970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-3970: - Labels: pull-request-available (was: ) > Generating multiple views of a value at scan time > - > > Key: ACCUMULO-3970 > URL: https://issues.apache.org/jira/browse/ACCUMULO-3970 > Project: Accumulo > Issue Type: New Feature >Reporter: Russ Weeks >Priority: Minor > Labels: pull-request-available > > It would be useful to have the ability to generate different representations > of a key-value pair at scan time, based on the scan authorizations. > For example, consider [HIPAA safe harbour > de-identification|http://www.hhs.gov/ocr/privacy/hipaa/understanding/coveredentities/De-identification/guidance.html#dates]. > One of the rules for de-identifying a patient's date of birth is that if a > patient is 89 years old or younger, you can disclose his exact year of birth. > If a patient is 90 years old or over, you pretend that he's 90 years old. > You can imagine implementing this as a key/value mapping in Accumulo like, > {{(pt_id, demographic, pt_dob, PII_DOB) -> "1925-08-22"}} > {{(pt_id, demographic, pt_dob, SHD_DOB) -> "1925"}} > Where the value corresponding to visibility SHD_DOB is produced at scan-time, > depending on the patient's current age. > Another example would be the ability to produce a salted hash of a unique > identifier like a social security number or medical record number, where the > salt (or the hash algorithm, or the work factor...) could be specified > dynamically without having to re-code all the values in the system. > More broadly speaking, this feature would give organizations more flexibility > to change how they deidentify, transform or anonymize data to suit different > access levels. > Of course, to do this you'd need to have a pluggable component that can > process key/value pairs before visibilities are evaluated. 
I can see why this > might give a lot of people the heebie-jeebies but I'd like to gather as much > feedback as possible. Looking forward to hearing your thoughts! -- This message was sent by Atlassian JIRA (v6.4.14#64029)
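As a rough illustration of the date-of-birth example in ACCUMULO-3970, the scan-time view could be computed along the following lines. This is only a sketch: the class and method names are hypothetical and not part of any Accumulo API, and the treatment of patients aged 90 or over (reporting the birth year of someone who is exactly 90) is one reading of "you pretend that he's 90 years old".

```java
import java.time.LocalDate;
import java.time.Period;

/** Hypothetical scan-time view of a date of birth; not an Accumulo API. */
final class SafeHarborDob {

  /**
   * Returns the year to disclose for the SHD_DOB view.
   * Assumption: a patient aged 90 or over is reported as if exactly 90,
   * that is, with the birth year of someone who turns 90 in the current year.
   */
  static String shdView(LocalDate dob, LocalDate today) {
    int age = Period.between(dob, today).getYears();
    int year = age <= 89 ? dob.getYear() : today.getYear() - 90;
    return Integer.toString(year);
  }
}
```

The point of the sketch is that the derived value depends on the current date, which is why it would have to be produced at scan time rather than stored.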
[jira] [Updated] (ACCUMULO-3185) Remove "walogs" directory from MAC instances
[ https://issues.apache.org/jira/browse/ACCUMULO-3185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-3185: - Labels: newbie pull-request-available (was: newbie) > Remove "walogs" directory from MAC instances > > > Key: ACCUMULO-3185 > URL: https://issues.apache.org/jira/browse/ACCUMULO-3185 > Project: Accumulo > Issue Type: Task > Components: mini >Affects Versions: 1.5.2, 1.6.1 >Reporter: Josh Elser >Priority: Trivial > Labels: newbie, pull-request-available > Fix For: 2.0.0 > > > Each MAC installation still makes a "walogs" directory which I'm guessing is > a holdover from pre-HDFS-WALs in 1.4. The directory appears to always be > empty and should not be created. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4743) Use tserver prefix for Cache config instead of general custom
[ https://issues.apache.org/jira/browse/ACCUMULO-4743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4743: - Labels: pull-request-available (was: ) > Use tserver prefix for Cache config instead of general custom > - > > Key: ACCUMULO-4743 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4743 > Project: Accumulo > Issue Type: Bug >Reporter: Keith Turner > Labels: pull-request-available > Fix For: 2.0.0 > > > In ACCUMULO-4463 caching implementations were made configurable. At the time > config for a cache was placed under the {{general.custom.cache}} prefix. The > idea was that a cache impl would be passed an AccumuloConfiguration object > and it could grab whatever it wanted from {{general.custom.cache}}. > In ACCUMULO-4644 the cache API was changed to not use AccumuloConfiguration > because that type is not in the public API. This removed the reason for > using general.custom in the first place. > When trying to implement an external cache I discovered that I cannot set > general.custom props in ZooKeeper. However, the cache class is a > tserver-prefixed property and can be set in ZooKeeper. For the > [accumulo-ohc|https://github.com/keith-turner/accumulo-ohc] I wanted to do > the following in the shell but could not set the general props. > {noformat} > config -s general.custom.cache.ohc.data.on-heap.maximumWeight=10 > config -s general.custom.cache.ohc.data.off-heap.capacity=100 > config -s tserver.cache.manager.class=accumulo.ohc.OhcCacheManager > {noformat} > I think it would be nice to change cache config prefix from > {{general.custom.cache..}} to > {{tserver.cache.config..}}. Doing this would enable > setting up a cache in the shell like the following. 
> {noformat} > config -s tserver.cache.config.data.on-heap.maximumWeight=10 > config -s tserver.cache.config.data.off-heap.capacity=100 > config -s tserver.cache.manager.class=accumulo.ohc.OhcCacheManager > {noformat} > Part of the code for this prefix is at [BlockCacheManager.java line > 33|https://github.com/apache/accumulo/blob/d877a2df2943e48d70d99b96616844d0dff9a501/core/src/main/java/org/apache/accumulo/core/file/blockfile/cache/BlockCacheManager.java#L33] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4546) IllegalTableTransitionException should include a default message that logs the requested state transition
[ https://issues.apache.org/jira/browse/ACCUMULO-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4546: - Labels: pull-request-available (was: ) > IllegalTableTransitionException should include a default message that logs > the requested state transition > - > > Key: ACCUMULO-4546 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4546 > Project: Accumulo > Issue Type: Bug > Components: server-base >Affects Versions: 1.6.6 >Reporter: John Vines >Assignee: Mark Owens > Labels: pull-request-available > Fix For: 1.7.4, 1.8.2, 2.0.0 > > > While trying to track down the root of an Illegal state transition for a > table, I hit a dead end when the original transition to bring a table online > failed. The IllegalTableTransitionException takes in the old and new states > in the constructor for the exception, but these states are not used to > construct any sort of message, so this information isn't available in the > logs. We should have a default message for this constructor. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
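The default message requested in ACCUMULO-4546 might look roughly like the following. This is a standalone stand-in, not the Accumulo class: the TableState values, the three-argument constructor, and the message wording are all assumptions made for illustration.

```java
/** Stand-in for the table states; the values here are assumed for illustration. */
enum TableState { NEW, ONLINE, OFFLINE, DELETING, UNKNOWN }

/** Standalone sketch of an exception that always carries the transition in its message. */
final class IllegalTableTransitionException extends Exception {
  private static final long serialVersionUID = 1L;

  private final TableState oldState;
  private final TableState newState;

  /** Two-state constructor: falls back to the default message. */
  IllegalTableTransitionException(TableState oldState, TableState newState) {
    this(oldState, newState, "");
  }

  /** Hypothetical overload: a caller-supplied message wins when it is non-empty. */
  IllegalTableTransitionException(TableState oldState, TableState newState, String message) {
    super(message == null || message.isEmpty() ? defaultMessage(oldState, newState) : message);
    this.oldState = oldState;
    this.newState = newState;
  }

  private static String defaultMessage(TableState oldState, TableState newState) {
    return "Error transitioning from " + oldState + " state to " + newState + " state";
  }

  TableState getOldState() {
    return oldState;
  }

  TableState getNewState() {
    return newState;
  }
}
```

With this shape, any logger that prints the exception also prints both states, which is the information the reporter could not find.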
[jira] [Updated] (ACCUMULO-4626) improve cache hit rate via weak reference map
[ https://issues.apache.org/jira/browse/ACCUMULO-4626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4626: - Labels: performance pull-request-available stability (was: performance stability) > improve cache hit rate via weak reference map > - > > Key: ACCUMULO-4626 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4626 > Project: Accumulo > Issue Type: Improvement > Components: tserver >Reporter: Adam Fuchs > Labels: performance, pull-request-available, stability > Time Spent: 1h > Remaining Estimate: 0h > > When a single iterator tree references the same RFile blocks in different > branches we sometimes get cache misses for one iterator even though the > requested block is held in memory by another iterator. This is particularly > important when using something like the IntersectingIterator to intersect > many deep copies. Instead of evicting completely, keeping evicted blocks in > a WeakReference value map can avoid re-reading blocks that are currently > referenced by another deep copied source iterator. > We've seen this in the field for some of Sqrrl's queries against very large > tablets. The total memory usage for these queries can be equal to the size of > all the iterator block reads times the number of readahead threads times the > number of files times the number of IntersectingIterator children when cache > miss rates are high. This might work out to something like: > {code} > 16 readahead threads * 200 deep copied children * 99% cache miss rate * 20 > files * 252KB per reader = ~16GB of memory > {code} > In most cases, evicting to a weak reference value map changes the cache miss > rate from very high to very low and has a dramatic effect on total memory > usage. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
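The idea in ACCUMULO-4626 of evicting into a weak reference value map can be sketched as below. This is a toy cache under stated assumptions, not the Accumulo block cache: the class name is hypothetical, blocks are plain byte arrays, and the strong tier is a small access-ordered LinkedHashMap.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy LRU cache that demotes evicted blocks to weak references instead of dropping them. */
final class WeakFallbackCache {
  // Evicted blocks stay reachable here only while some other holder
  // (for example a deep-copied iterator) still references them strongly.
  private final Map<String, WeakReference<byte[]>> evicted = new HashMap<>();
  private final LinkedHashMap<String, byte[]> strong;

  WeakFallbackCache(final int capacity) {
    // Access-ordered map: the eldest entry is the least recently used one.
    strong = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
      private static final long serialVersionUID = 1L;

      @Override
      protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
        if (size() > capacity) {
          evicted.put(eldest.getKey(), new WeakReference<>(eldest.getValue()));
          return true;
        }
        return false;
      }
    };
  }

  synchronized void put(String name, byte[] block) {
    strong.put(name, block);
  }

  /** Returns the cached block, or null if it is absent or already garbage collected. */
  synchronized byte[] get(String name) {
    byte[] block = strong.get(name);
    if (block != null) {
      return block;
    }
    WeakReference<byte[]> ref = evicted.remove(name);
    block = ref == null ? null : ref.get();
    if (block != null) {
      strong.put(name, block); // promote back into the strong tier
    }
    return block;
  }
}
```

A block that has been evicted but is still held by another reader is found again without a re-read; once nobody holds it, the garbage collector is free to reclaim it. A real implementation would also need to prune cleared references from the weak map.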
[jira] [Updated] (ACCUMULO-4392) Expose metrics2 JVM metrics
[ https://issues.apache.org/jira/browse/ACCUMULO-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4392: - Labels: pull-request-available (was: ) > Expose metrics2 JVM metrics > --- > > Key: ACCUMULO-4392 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4392 > Project: Accumulo > Issue Type: Improvement >Reporter: Dave Marion > Labels: pull-request-available > Fix For: 2.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > > The Hadoop services create a JVMMetrics instance in their server metrics > objects (e.g. DataNodeMetrics) to export internal JVM metrics. It does not > appear that this is happening for the Accumulo server processes. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4669) RFile can create very large blocks when key statistics are not uniform
[ https://issues.apache.org/jira/browse/ACCUMULO-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4669: - Labels: pull-request-available (was: ) > RFile can create very large blocks when key statistics are not uniform > -- > > Key: ACCUMULO-4669 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4669 > Project: Accumulo > Issue Type: Bug > Components: core >Affects Versions: 1.7.2, 1.7.3, 1.8.0, 1.8.1 >Reporter: Adam Fuchs >Assignee: Keith Turner >Priority: Blocker > Labels: pull-request-available > Fix For: 1.7.4, 1.8.2, 2.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > RFile.Writer.append checks for giant keys and avoids writing them as index > blocks. This check is flawed and can result in multi-GB blocks. In our case, > a 20GB compressed RFile had one block with over 2GB raw size. This happened > because the key size statistics changed after some point in the file. The > code in question follows: > {code} > private boolean isGiantKey(Key k) { > // consider a key thats more than 3 standard deviations from previously > seen key sizes as giant > return k.getSize() > keyLenStats.getMean() + > keyLenStats.getStandardDeviation() * 3; > } > ... > if (blockWriter == null) { > blockWriter = fileWriter.prepareDataBlock(); > } else if (blockWriter.getRawSize() > blockSize) { > ... > if ((prevKey.getSize() <= avergageKeySize || blockWriter.getRawSize() > > maxBlockSize) && !isGiantKey(prevKey)) { > closeBlock(prevKey, false); > ... > {code} > Before closing a block that has grown beyond the target block size we check > to see that the key is below average in size or that the block is 1.1 times > the target block size (maxBlockSize), and we check that the key isn't a > "giant" key, or more than 3 standard deviations from the mean of keys seen so > far. > Our RFiles often have one row of data with different column families > representing various forward and inverted indexes. 
This is a table design > similar to the WikiSearch example. The first column family in this case had > very uniform, relatively small key sizes. This first column family comprised > gigabytes of data, split up into roughly 100KB blocks. When we switched to > the next column family the keys grew in size, but were still under about 100 > bytes. The statistics of the first column family had firmly established a > smaller mean and tiny standard deviation (approximately 0), and it took over > 2GB of larger keys to bring the standard deviation up enough so that keys > were no longer considered "giant" and the block could be closed. > Now that we're aware, we see large blocks (more than 10x the target block > size) in almost every RFile we write. This only became a glaring problem when > we got OOM exceptions trying to decompress the block, but it also shows up in > a number of subtle performance problems, like high variance in latencies for > looking up particular keys. > The fix for this should produce bounded RFile block sizes, limited to the > greater of 2x the maximum key/value size in the block and some configurable > threshold, such as 1.1 times the compressed block size. We need a firm cap to > be able to reason about memory usage in various applications. 
> The following code produces arbitrarily large RFile blocks: > {code} > FileSKVWriter writer = RFileOperations.getInstance().openWriter(filename, > fs, conf, acuconf); > writer.startDefaultLocalityGroup(); > SummaryStatistics keyLenStats = new SummaryStatistics(); > Random r = new Random(); > byte [] buffer = new byte[minRowSize]; > for(int i = 0; i < 10; i++) { > byte [] valBytes = new byte[valLength]; > r.nextBytes(valBytes); > r.nextBytes(buffer); > ByteBuffer.wrap(buffer).putInt(i); > Key k = new Key(buffer, 0, buffer.length, emptyBytes, 0, 0, emptyBytes, > 0, 0, emptyBytes, 0, 0, 0); > Value v = new Value(valBytes); > writer.append(k, v); > keyLenStats.addValue(k.getSize()); > int newBufferSize = Math.max(buffer.length, (int) > Math.ceil(keyLenStats.getMean() + keyLenStats.getStandardDeviation() * 4 + > 0.0001)); > buffer = new byte[newBufferSize]; > if(keyLenStats.getSum() > targetSize) > break; > } > writer.close(); > {code} > One telltale symptom of this bug is an OutOfMemoryError thrown from a > readahead thread with message "Requested array size exceeds VM limit". This > will only happen if the block cache size is big enough to hold the expected > raw block size, 2GB in our case. This message is rare, and really only > happens when allocating an array of size Integer.MAX_VALUE or >
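The statistical trap described in ACCUMULO-4669 can be seen in isolation with a few lines of Java. The sketch below is not the RFile code: it replaces commons-math SummaryStatistics with a hand-rolled running mean and sample standard deviation (Welford's method), and the key sizes of 50 and 90 bytes are made-up numbers chosen only to mirror the "small uniform keys, then slightly larger keys" scenario.

```java
/** Running key-length statistics with the same 3-sigma "giant key" test as in the issue. */
final class KeyLengthStats {
  private long n;
  private double mean;
  private double m2; // running sum of squared deviations from the mean (Welford)

  void add(double size) {
    n++;
    double delta = size - mean;
    mean += delta / n;
    m2 += delta * (size - mean);
  }

  /** Sample standard deviation; zero until at least two values have been seen. */
  double stdDev() {
    return n > 1 ? Math.sqrt(m2 / (n - 1)) : 0.0;
  }

  /** A key counts as giant when it exceeds the mean by more than 3 standard deviations. */
  boolean isGiant(double size) {
    return size > mean + 3 * stdDev();
  }

  public static void main(String[] args) {
    KeyLengthStats stats = new KeyLengthStats();
    for (int i = 0; i < 900; i++) {
      stats.add(50); // first column family: uniform small keys
    }
    System.out.println("giant after uniform keys: " + stats.isGiant(90));
    for (int i = 0; i < 50; i++) {
      stats.add(90); // second column family: modestly larger keys
    }
    System.out.println("giant after 50 larger keys: " + stats.isGiant(90));
  }
}
```

In this two-size model, a short calculation shows the check keeps classifying the larger keys as giant until roughly one key in ten is of the larger size, so the longer the uniform prefix, the longer a block stays open. That is consistent with the multi-GB block reported above and is the reason a hard cap on block size is needed regardless of the statistics.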
[jira] [Updated] (ACCUMULO-4693) Add instance-specific ProcessName to Hadoop2 metrics
[ https://issues.apache.org/jira/browse/ACCUMULO-4693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4693: - Labels: pull-request-available (was: ) > Add instance-specific ProcessName to Hadoop2 metrics > > > Key: ACCUMULO-4693 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4693 > Project: Accumulo > Issue Type: Improvement >Affects Versions: 1.8.1 >Reporter: Bill Oley >Priority: Minor > Labels: pull-request-available > Fix For: 1.8.2 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > Current Hadoop2 metrics from the tablet servers are not labelled by instance > when multiple instances are run on the same host. > By adding a ProcessName tag (Master, TabletServer-1, TabletServer-2, > GarbageCollector, etc) to the MetricsRegistry, this tag will be included on > all metrics produced and allow the Metrics2 sinks to use this information to > deconflict these metrics. > The relevant service and instance information can be found in the system > properties: > -Dapp and -Daccumulo.service.instance written by the accumulo script. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4419) Create Compressor factory allowing Compression settings to be updated
[ https://issues.apache.org/jira/browse/ACCUMULO-4419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4419: - Labels: core pull-request-available (was: core) > Create Compressor factory allowing Compression settings to be updated > - > > Key: ACCUMULO-4419 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4419 > Project: Accumulo > Issue Type: Improvement >Reporter: marco polo >Assignee: marco polo >Priority: Minor > Labels: core, pull-request-available > Fix For: 1.7.4, 1.8.2, 2.0.0 > > Time Spent: 4h 50m > Remaining Estimate: 0h > > This ticket is to account for work done elsewhere in which I've made the > compression pool configurable such that we either don't use the pool at all > or use an adjustable pool based on commons-pool. > Other configuration options are now updated through a CompressionUpdate > mechanism. > This PR will move us away from CodecPool, but will allow us greater control > over trimming codecs from the pool itself. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4356) Stop bundling dependencies in binary tarball
[ https://issues.apache.org/jira/browse/ACCUMULO-4356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4356: - Labels: pull-request-available (was: ) > Stop bundling dependencies in binary tarball > > > Key: ACCUMULO-4356 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4356 > Project: Accumulo > Issue Type: Improvement > Components: build >Reporter: Christopher Tubbs >Assignee: Christopher Tubbs > Labels: pull-request-available > Fix For: 2.0.0 > > Time Spent: 4h 10m > Remaining Estimate: 0h > > Currently, our binary tarball is built containing our own built jars, > scripts, etc. as well as some extra dependencies which we assume are not > available in either the Hadoop lib directory or the ZooKeeper lib directory. > These assumptions are tenuous, since we do not know what environment a user > is going to be running in, and which jars they already have installed on > their system, provided by their classpath or otherwise. Nor do we know > whether the specific versions of the dependencies we're bundling for the user > are compatible with what they have on their system. > What we are trying to do is make things convenient for a user, by performing > integration/packager/dependency convergence tasks on behalf of the user... > all based on poorly defined assumptions. > This bundling also adds an extra burden on us, as the upstream project, to > maintain complex LICENSE/NOTICE files for the bundled tarball artifact we > produce, and it's *very* easy for these legal files to unintentionally become > out of sync when we change a version of a dependency we are bundling. > We should not bundle any dependencies inside our binary tarball. For > convenience, we can instead provide a script which allows the user to easily > download the dependencies we're currently assuming they will need (the same > ones we're currently packaging for them). 
This will provide nearly the same > convenience as we are currently providing, but in a way which does not > require burdensome maintenance on our LICENSE/NOTICE files, and in a way that > the user could easily customize this script to download the dependencies they > actually need, if our assumptions aren't valid for their environment. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4745) Monitor Table hyperlinks are broken
[ https://issues.apache.org/jira/browse/ACCUMULO-4745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4745: - Labels: pull-request-available (was: ) > Monitor Table hyperlinks are broken > --- > > Key: ACCUMULO-4745 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4745 > Project: Accumulo > Issue Type: Bug > Components: monitor >Affects Versions: 2.0.0 >Reporter: Kyle Van Gilson > Labels: pull-request-available > > In the accumulo-monitor Table Status page, the hyperlinked table list items > have malformed links of the form: > * http://localhost:9995/tables/[object%20Object] > They should be of the form > * http://localhost:9995/tables/ > * e.g. http://localhost:9995/tables/1 > I'm not 100% sure where the issue is, but it looks like the > server/monitor/resources/js/tables.js ~line 187 might be where the link > itself is being generated. > To reproduce, create an assembly off of the master branch (2.0.0-snapshot), > init/start accumulo master/tserver/monitor, go to > http://localhost:9995/tables/ and any of the table links should exhibit the > issue. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-2341) stop-all.sh hung when Accumulo already down
[ https://issues.apache.org/jira/browse/ACCUMULO-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-2341: - Labels: newbie pull-request-available (was: newbie) > stop-all.sh hung when Accumulo already down > --- > > Key: ACCUMULO-2341 > URL: https://issues.apache.org/jira/browse/ACCUMULO-2341 > Project: Accumulo > Issue Type: Bug >Affects Versions: 1.6.0 >Reporter: Dave Marion >Priority: Trivial > Labels: newbie, pull-request-available > > {noformat} > "admin" prio=10 tid=0x40d09800 nid=0x124f waiting on condition > [0x7f85db747000] >java.lang.Thread.State: TIMED_WAITING (sleeping) > at java.lang.Thread.sleep(Native Method) > at > org.apache.accumulo.core.util.UtilWaitThread.sleep(UtilWaitThread.java:26) > at > org.apache.accumulo.core.client.impl.MasterClient.getConnectionWithRetry(MasterClient.java:48) > at > org.apache.accumulo.core.client.impl.MasterClient.executeGeneric(MasterClient.java:126) > at > org.apache.accumulo.core.client.impl.MasterClient.execute(MasterClient.java:171) > at org.apache.accumulo.server.util.Admin.stopServer(Admin.java:292) > at org.apache.accumulo.server.util.Admin.main(Admin.java:195) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.accumulo.start.Main$1.run(Main.java:137) > at java.lang.Thread.run(Thread.java:662) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4744) Using RFile API with cache and multiple files hides data
[ https://issues.apache.org/jira/browse/ACCUMULO-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4744: - Labels: pull-request-available (was: ) > Using RFile API with cache and multiple files hides data > > > Key: ACCUMULO-4744 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4744 > Project: Accumulo > Issue Type: Bug >Affects Versions: 1.8.0, 1.8.1 >Reporter: Keith Turner >Priority: Critical > Labels: pull-request-available > Fix For: 1.8.2 > > > Noticed this bug in source code while working on ACCUMULO-4641. When using > the RFile API introduced in 1.8 to read from multiple files with cache > enabled, not all data may be seen. This happens because internally the code > gives all input sources the same cache id. Therefore index and data blocks > from multiple files collide in the cache. > This bug does not happen when reading data through tserver, only the RFile > API. > {code:java} > Scanner scanner = >RFile.newScanner() >.from(file1, file2, file3) //multiple input files >.withFileSystem(localFs) >.withIndexCache(100) //enabled cache >.withDataCache(1000) //enabled cache >.build(); > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
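The collision described in ACCUMULO-4744 can be illustrated with a toy cache. The key scheme below (cache id plus block offset) and all names are assumptions made for the sketch; the issue only states that every input source was given the same cache id.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy block cache keyed by "cacheId@offset"; the key scheme is hypothetical. */
final class CacheIdCollision {
  private final Map<String, String> cache = new HashMap<>();

  /** Returns the cached block for this key, loading blockOnDisk only on a miss. */
  String readBlock(String cacheId, long offset, String blockOnDisk) {
    return cache.computeIfAbsent(cacheId + "@" + offset, key -> blockOnDisk);
  }

  public static void main(String[] args) {
    CacheIdCollision cache = new CacheIdCollision();
    // Both files share one cache id, so the second read is served the first file's block.
    cache.readBlock("shared", 0L, "file1-block0");
    System.out.println(cache.readBlock("shared", 0L, "file2-block0"));
  }
}
```

Giving each input file its own cache id makes the keys distinct, so data from the second file is no longer hidden behind the first.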
[jira] [Updated] (ACCUMULO-4739) Make 3rd party web resources (js, css) location configurable
[ https://issues.apache.org/jira/browse/ACCUMULO-4739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4739: - Labels: pull-request-available (was: ) > Make 3rd party web resources (js, css) location configurable > > > Key: ACCUMULO-4739 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4739 > Project: Accumulo > Issue Type: Task > Components: monitor >Reporter: Christopher Tubbs >Assignee: Michael Miller >Priority: Blocker > Labels: pull-request-available > Fix For: 2.0.0 > > > Currently, in the new monitor for 2.0 (after ACCUMULO-3005), some 3rd party > web resources are accessed via an external CDN. This is suitable in many > cases, but could be problematic for client browsers not currently connected > to the internet or without a cached copy of the resources from the CDN. > These resources include bootstrap and jquery. Flot is also a 3rd party > resource, but is currently bundled with Accumulo and served by the monitor. > The location of these resources should be made configurable, so that they can > be bundled with, and served by, the Accumulo monitor instead of an > internet-based CDN. Making the locations configurable also makes it possible > for users to update, if there's a bug in a particular version of jquery that > the administrator wishes to avoid, or they want to use a different bootstrap > theme, for example. > Any new configuration option added to support making these configurable > should be capable of supporting an arbitrary number of script and stylesheet > resources, and possibly other resource types, as well as any accompanying > integrity/crossorigin attributes for CDN access (see > server/monitor/src/main/resources/templates/default.ftl for current values). > Also, I think the default value should be to point to the CDN, and not the > locally bundled and served resources, so that the browser can take advantage > of any caching for these commonly used resources. 
This would allow us to > achieve ACCUMULO-2983 by stopping bundling these third party resources, but > still supporting bundling, if needed. > To complete this issue, we basically need 2 things: > # Ensure monitor serves (to a predictable location) whatever arbitrary static > resources it finds on the class path (so users can bundle their own static > resources), and > # Ensure resources are configurable to point to the served versions or > versions in a CDN. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4740) Enable GCM mode for crypto
[ https://issues.apache.org/jira/browse/ACCUMULO-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4740: - Labels: pull-request-available (was: ) > Enable GCM mode for crypto > -- > > Key: ACCUMULO-4740 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4740 > Project: Accumulo > Issue Type: Improvement >Affects Versions: 2.0.0 >Reporter: Nick Felts >Assignee: Nick Felts >Priority: Minor > Labels: pull-request-available > > Enable the use of GCM as an optional encryption mode. > While this change will allow for GCM, it should probably only be used for > Java 9 and later. > https://docs.oracle.com/javase/9/whatsnew/toc.htm#JSNEW-GUID-71A09701-7412-4499-A88D-53FA8BFBD3D0 > > http://openjdk.java.net/jeps/246 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4714) Create landing page for new developers
[ https://issues.apache.org/jira/browse/ACCUMULO-4714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4714: - Labels: pull-request-available (was: ) > Create landing page for new developers > -- > > Key: ACCUMULO-4714 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4714 > Project: Accumulo > Issue Type: Improvement > Components: website >Reporter: Michael Miller >Assignee: Mark Owens > Labels: pull-request-available > > The website has a lot of good information for contributing to Accumulo but it > is scattered across multiple pages. There is no clear, concise page that can > be sent as a link to developers interested in committing to the project. I > feel like this is a turn-off for someone who is interested in contributing to > Accumulo. > This page would be a good place but it is just a bunch of links: > https://accumulo.apache.org/contributor/ > As a recent newcomer I would tend to go here: > https://accumulo.apache.org/contributor/source > But this page is confusing. The first instructions you get (after more > links) explain how to build the website. Then when you get to the developers > guide the very first thing is a paragraph about activating the Thrift > profile. While this information is all very useful, the first 2 scenarios > are edge cases of development and it does not ease a new developer into > writing code for Accumulo. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (ACCUMULO-4677) Sanitize @PathParam and @QueryParam parameters in new REST-based monitor
[ https://issues.apache.org/jira/browse/ACCUMULO-4677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4677: - Labels: pull-request-available (was: ) > Sanitize @PathParam and @QueryParam parameters in new REST-based monitor > > > Key: ACCUMULO-4677 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4677 > Project: Accumulo > Issue Type: Bug > Components: monitor >Reporter: Christopher Tubbs >Assignee: Kyle Van Gilson >Priority: Blocker > Labels: pull-request-available > Fix For: 2.0.0 > > Time Spent: 3h 10m > Remaining Estimate: 0h > > Following on the issue identified in ACCUMULO-4660, I verified that > parameters to the REST-based monitor (ACCUMULO-3005) resources need > sanitization as well. > All {{@PathParam}} and {{@QueryParam}} annotated fields should be sanitized. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
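One common way to sanitize request parameters such as those in ACCUMULO-4677 is to accept only values matching a strict allow-list pattern and reject everything else. The helper below is a hypothetical sketch: the class name, the pattern, and the 64-character limit are assumptions, and a real monitor would need a pattern that covers every legal identifier it serves.

```java
import java.util.regex.Pattern;

/** Hypothetical allow-list check for path and query parameter values. */
final class ParamValidation {
  // Assumed pattern: letters, digits, underscore and hyphen, 1 to 64 characters.
  private static final Pattern SAFE = Pattern.compile("[A-Za-z0-9_-]{1,64}");

  /** Returns the value unchanged if it is safe, otherwise throws. */
  static String requireSafe(String value) {
    if (value == null || !SAFE.matcher(value).matches()) {
      throw new IllegalArgumentException("Rejected request parameter");
    }
    return value;
  }

  public static void main(String[] args) {
    System.out.println(requireSafe("1a"));
    try {
      requireSafe("<script>alert(1)</script>");
    } catch (IllegalArgumentException e) {
      System.out.println("rejected");
    }
  }
}
```

Rejecting unexpected input outright is usually simpler to reason about than trying to escape it, though output escaping is still worth keeping as a second line of defence.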
[jira] [Updated] (ACCUMULO-4641) Modify BlockCache interface to avoid race conditions
[ https://issues.apache.org/jira/browse/ACCUMULO-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4641: - Labels: pull-request-available (was: ) > Modify BlockCache interface to avoid race conditions > > > Key: ACCUMULO-4641 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4641 > Project: Accumulo > Issue Type: Sub-task >Reporter: Keith Turner >Assignee: Keith Turner > Labels: pull-request-available > Fix For: 2.0.0 > > > Currently the BlockCache interface has functions to get and put. Accumulo > will try to get a block, if it does not exist load it, and then put it in the > cache. This can lead to race conditions where multiple threads unnecessarily > load the same block. > I think it would be better to modify the block cache interface to only have a > function like the following. > {code:java} > CacheEntry get(String blockName, BlockLoader loader) > {code} > BlockLoader represents a function that the cache can call if a block is not > present. The cache implementation can attempt to handle load race conditions > however it likes. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
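One way a cache could implement the get-with-loader call proposed in ACCUMULO-4641 is to delegate to ConcurrentHashMap.computeIfAbsent, which runs the loader at most once per absent key. This is only a sketch: the BlockLoader shape and the use of a raw byte array in place of CacheEntry are assumptions, and eviction is left out entirely.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Assumed loader shape: called by the cache only when the block is missing. */
interface BlockLoader {
  byte[] load();
}

/** Sketch of a cache whose only entry point is get(name, loader). */
final class LoadingBlockCache {
  private final ConcurrentHashMap<String, byte[]> blocks = new ConcurrentHashMap<>();

  /**
   * Returns the cached block, loading it on a miss. computeIfAbsent applies the
   * mapping function atomically, so concurrent callers for the same block
   * wait for one load instead of each loading it.
   */
  byte[] get(String blockName, BlockLoader loader) {
    return blocks.computeIfAbsent(blockName, name -> loader.load());
  }
}
```

Because callers can no longer do a separate get and put, the cache implementation is free to choose how it handles concurrent loads; this sketch simply lets the map serialize them per key.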
[jira] [Updated] (ACCUMULO-4730) Create an Entry length summarizer
[ https://issues.apache.org/jira/browse/ACCUMULO-4730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4730: - Labels: newbie pull-request-available (was: newbie) > Create an Entry length summarizer > - > > Key: ACCUMULO-4730 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4730 > Project: Accumulo > Issue Type: Improvement >Reporter: Keith Turner >Assignee: Jared R > Labels: newbie, pull-request-available > Fix For: 2.0.0 > > > It would be very useful to have a built-in > [Summarizer|https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/summary/Summarizer.java] > that computes summary information about field lengths. Specifically key > length, row length, family length, qualifier length, visibility length, and > value length. Whatever stats are computed must be able to be computed > incrementally. For example, it can incrementally compute min, max, count, sum, > and log2 histogram. I think these would be good stats to start with. Count > and sum can be used to compute the average. There is an example of computing > a log2 histogram in the Summarizer javadoc. > The Summarizer could be named EntryLengthSummarizer and possibly produce > summaries like the following. > {noformat} > count=XXX //do not need to track this per field, it's the same for all > key.min=XXX > key.max=XXX > key.sum=XXX > key.logHist.8=XXX //only output non zero exponents > key.logHist.9=XXX > row.min=XXX > row.max=XXX > row.sum=XXX > row.logHist.7=XXX > row.logHist.8=XXX > row.logHist.10=XXX > family.min=XXX > family.max=XXX > family.sum=XXX > family.logHist.6=XXX > family.logHist.7=XXX > etc... > {noformat} > This new summarizer would be placed in the > [summarizers|https://github.com/apache/accumulo/tree/master/core/src/main/java/org/apache/accumulo/core/client/summary/summarizers] > package. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
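The per-field statistics listed in ACCUMULO-4730 can all be maintained incrementally and merged, as the sketch below tries to show. It does not implement the Summarizer interface; the class name is hypothetical, and the bucket rule (floor of log2 of the length, with lengths 0 and 1 both in bucket 0) is an assumption rather than the rule used in the Summarizer javadoc.

```java
import java.util.Map;
import java.util.TreeMap;

/** Incremental min, max, sum, count and log2 histogram for one field's lengths. */
final class LengthStats {
  private long min = Long.MAX_VALUE;
  private long max = Long.MIN_VALUE;
  private long sum;
  private long count;
  private final long[] logHist = new long[32];

  /** Assumed bucket rule: floor(log2(length)); lengths 0 and 1 share bucket 0. */
  static int bucket(int length) {
    return length <= 1 ? 0 : 31 - Integer.numberOfLeadingZeros(length);
  }

  void accept(int length) {
    min = Math.min(min, length);
    max = Math.max(max, length);
    sum += length;
    count++;
    logHist[bucket(length)]++;
  }

  /** Combines two partial results, e.g. statistics gathered from different files. */
  void merge(LengthStats other) {
    min = Math.min(min, other.min);
    max = Math.max(max, other.max);
    sum += other.sum;
    count += other.count;
    for (int i = 0; i < logHist.length; i++) {
      logHist[i] += other.logHist[i];
    }
  }

  long count() {
    return count;
  }

  /** Emits entries such as "row.min" and "row.logHist.8", skipping empty buckets. */
  Map<String, Long> toSummary(String field) {
    Map<String, Long> out = new TreeMap<>();
    if (count == 0) {
      return out;
    }
    out.put(field + ".min", min);
    out.put(field + ".max", max);
    out.put(field + ".sum", sum);
    for (int i = 0; i < logHist.length; i++) {
      if (logHist[i] > 0) {
        out.put(field + ".logHist." + i, logHist[i]);
      }
    }
    return out;
  }
}
```

One LengthStats instance per field (key, row, family, qualifier, visibility, value) plus a single shared count would yield output in the shape shown in the issue; the average follows from sum divided by count.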
[jira] [Updated] (ACCUMULO-4737) Clean up cipher algorithm configuration
[ https://issues.apache.org/jira/browse/ACCUMULO-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ACCUMULO-4737: - Labels: pull-request-available (was: ) > Clean up cipher algorithm configuration > --- > > Key: ACCUMULO-4737 > URL: https://issues.apache.org/jira/browse/ACCUMULO-4737 > Project: Accumulo > Issue Type: Improvement >Reporter: Nick Felts >Assignee: Nick Felts >Priority: Minor > Labels: pull-request-available > > The two property options: > crypto.cipher.algorithm.name > crypto.cipher.suite > are not used intuitively. For example, as far as I can tell, the only place > the cipher suite's algorithm name is used is to check for NullCipher. I even > tested this using bogus strings to confirm. Instead, once the suite is found > to not indicate NullCipher, the cipher.algorithm.name replaces the algorithm > found in the cipher suite for all further uses. > Further, the suite is parsed out into padding and mode options, which only > exist to pass a few unit tests and reconstruct the cipher suite using the > other specified algorithm. > This leads to some unintuitive behavior, where someone specifying an > algorithm in the cipher suite is not necessarily using their intended > algorithm, unless both options specified the same algorithm. > To clean this up, the algorithm specified should be renamed and used for key > generation, since some keys can be used across different algorithms > (https://docs.oracle.com/javase/8/docs/api/java/security/Key.html), and the > cipher suite can be used as stated, instead of deconstructing it to then > reconstruct it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
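The split proposed in ACCUMULO-4737, where the cipher suite is used as stated and a separate algorithm name is used only for the key, maps directly onto the standard javax.crypto API. The sketch below is illustrative only: the class and parameter names are hypothetical, it assumes a suite that takes an IV (such as AES/CBC/PKCS5Padding), and it says nothing about how Accumulo actually wires its properties.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Sketch: the suite string goes to Cipher untouched; the key algorithm is separate. */
final class CipherConfigSketch {

  /**
   * Encrypts or decrypts input. cipherSuite is passed to Cipher.getInstance as-is,
   * with no parsing into algorithm, mode and padding; keyAlgorithm is used only
   * to label the key material.
   */
  static byte[] apply(int mode, String cipherSuite, String keyAlgorithm,
      byte[] key, byte[] iv, byte[] input) {
    try {
      Cipher cipher = Cipher.getInstance(cipherSuite);
      cipher.init(mode, new SecretKeySpec(key, keyAlgorithm), new IvParameterSpec(iv));
      return cipher.doFinal(input);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("Cipher configuration failed for " + cipherSuite, e);
    }
  }
}
```

With this arrangement a mismatch between the two settings surfaces as an error from the JCE provider instead of one setting silently overriding the other. The all-zero key and IV used in any quick test of this sketch are of course unsuitable for real data.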