[jira] [Resolved] (ACCUMULO-2145) Create upgrade test framework

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-2145.

Resolution: Duplicate

https://github.com/apache/accumulo-testing/issues/72

> Create upgrade test framework
> -
>
> Key: ACCUMULO-2145
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2145
> Project: Accumulo
>  Issue Type: Test
>Reporter: Keith Turner
>Assignee: John McNamee
>Priority: Major
> Attachments: ACCUMULO-2145.v2.patch, ACCUMULO-2145.v3.patch, 
> ACCUMULO-2145.v4.patch, updateTest.sh
>
>
> Accumulo upgrade testing in the past has been very minimal and mostly manual, 
> and as a result we have run into upgrade bugs.  It would be nice to have a 
> framework that makes it easy to write and run upgrade tests.
>   * Can be configured to use existing HDFS and zookeeper instances
>   * Can be configured with 1.5.x and 1.6.x branches to build
>   * Supports multiple upgrade scenarios (like clean shutdown, dirty shutdown, 
> etc.)
>   * Runs a set of upgrade tests (this would be a list of tests that is easy 
> to add to, e.g. a bulk import upgrade test)
> I am thinking the framework could do the following
>  {noformat}
>1. Build or download a version of 1.5
>2. Build or download a version of 1.6
>  
>foreach scenario {
>   foreach upgrade test{
>a. ask test for any 1.5 configuration
>b. ask test for any 1.6 configuration
>c. Unpack and configure 1.5  
>d. Unpack and configure 1.6
>e. Execute pre upgrade step of test
>f. Execute scenario
>g. Execute post upgrade step of test
>   }
>}
> {noformat} 
> The framework would configure the Accumulo versions, HDFS, zookeeper, and 
> which test to run.
> It would also be useful to write the framework in such a way that it could 
> support chaining upgrade tests.  For example, run a test that upgrades from 
> 1.4 to 1.5 to 1.6.  It's possible that a fresh install of 1.5 will upgrade 
> w/o problems, but a 1.5 system that was upgraded from 1.4 will not.
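The loop sketched above could be driven by a small interface-based harness. Everything below is hypothetical (none of these types exist in Accumulo); it is only a minimal sketch of the proposed driver, with the unpack/configure and scenario-execution steps left as comments:

```java
import java.util.List;

// Hypothetical sketch of the proposed upgrade-test driver; none of these
// types exist in Accumulo -- they only illustrate the framework idea.
public class UpgradeDriver {

  enum Scenario { CLEAN_SHUTDOWN, DIRTY_SHUTDOWN }

  public interface UpgradeTest {
    String oldVersionConfig();   // step a: configuration for the 1.5 instance
    String newVersionConfig();   // step b: configuration for the 1.6 instance
    void preUpgrade();           // step e: write data against the old version
    boolean postUpgrade();       // step g: verify data after the upgrade
  }

  // Returns the number of failed (scenario, test) combinations.
  public static int run(List<Scenario> scenarios, List<UpgradeTest> tests) {
    int failures = 0;
    for (Scenario scenario : scenarios) {
      for (UpgradeTest test : tests) {
        // steps c/d (unpack and configure both versions) and step f
        // (execute the scenario, e.g. clean vs dirty shutdown) would go here
        test.preUpgrade();
        if (!test.postUpgrade()) {
          failures++;
        }
      }
    }
    return failures;
  }
}
```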



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-1962) Improve batch scanner throughput in failure case

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-1962.

Resolution: Duplicate

https://github.com/apache/accumulo/issues/1120

> Improve batch scanner throughput in failure case
> 
>
> Key: ACCUMULO-1962
> URL: https://issues.apache.org/jira/browse/ACCUMULO-1962
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Priority: Major
> Fix For: 2.0.0
>
>
> The batch scanner currently does the following.
>  # bin ranges to tablet servers and tablets
>  # if any ranges could not be binned (e.g. a tablet had no location) goto 1
>  # queue up work for tablet servers on a thread pool
>  # wait for thread pool to complete all work
>  # if there were any failures goto 1
> In the face of failures (tablets not assigned because of migration, tablet 
> servers dying) it would be better if the batch scanner worked on what it 
> could and immediately requeued failures for processing.  The 
> ConditionalWriter and BatchWriter have failure queues and increase the retry 
> time if something keeps failing.
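As a rough sketch of the proposed behavior (illustrative code, not the actual batch scanner), each failed range can be requeued individually while other work proceeds, instead of the whole batch blocking on a retry:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Predicate;

// Illustrative sketch (not Accumulo code): failed ranges go straight back
// onto the work queue instead of forcing a full "goto 1" rebinning pass.
public class RequeueScanner {

  public static final class Range {
    public final String name;
    public final int attempt;
    Range(String name, int attempt) { this.name = name; this.attempt = attempt; }
  }

  // attemptSucceeds decides whether a given attempt of a range succeeds; in
  // the real batch scanner this would be an RPC to a tablet server.
  public static List<String> scan(List<String> ranges, Predicate<Range> attemptSucceeds) {
    Queue<Range> work = new ArrayDeque<>();
    ranges.forEach(r -> work.add(new Range(r, 1)));
    List<String> results = new ArrayList<>();
    while (!work.isEmpty()) {
      Range r = work.poll();
      if (attemptSucceeds.test(r)) {
        results.add(r.name);
      } else {
        // requeue immediately; a real impl would add a growing retry delay
        work.add(new Range(r.name, r.attempt + 1));
      }
    }
    return results;
  }
}
```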



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-2837) Scanning empty table can be optimized

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-2837.

Resolution: Fixed

> Scanning empty table can be optimized
> -
>
> Key: ACCUMULO-2837
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2837
> Project: Accumulo
>  Issue Type: Improvement
>Affects Versions: 1.6.0
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: performance
> Fix For: 2.0.0
>
>
> I wrote an application that at one point repeatedly scanned random rows in an 
> empty table.  It did not know the table was empty.  I was jstacking the 
> tserver and repeatedly saw stack traces in a few places in the code: 
>  * the audit canScan method
>  * DefaultConfiguration.get
>  * kept seeing the password token ungzipping the password??
> After fixing all of these issues (except for the system token) I ran a simple 
> test I wrote to recreate this issue and saw times go from ~1.7s to ~1.3s.
> I am not sure how much of an impact gzipping the password has.  I was just 
> sampling the stack traces manually.  It seems like I saw the password token 
> gzip stack traces less frequently, but I did catch it in multiple jstack 
> calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ACCUMULO-2837) Scanning empty table can be optimized

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner updated ACCUMULO-2837:
---
Fix Version/s: (was: 2.0.0)

> Scanning empty table can be optimized
> -
>
> Key: ACCUMULO-2837
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2837
> Project: Accumulo
>  Issue Type: Improvement
>Affects Versions: 1.6.0
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: performance
>
> I wrote an application that at one point repeatedly scanned random rows in an 
> empty table.  It did not know the table was empty.  I was jstacking the 
> tserver and repeatedly saw stack traces in a few places in the code: 
>  * the audit canScan method
>  * DefaultConfiguration.get
>  * kept seeing the password token ungzipping the password??
> After fixing all of these issues (except for the system token) I ran a simple 
> test I wrote to recreate this issue and saw times go from ~1.7s to ~1.3s.
> I am not sure how much of an impact gzipping the password has.  I was just 
> sampling the stack traces manually.  It seems like I saw the password token 
> gzip stack traces less frequently, but I did catch it in multiple jstack 
> calls.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-4823) Run autorefactor

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4823.

Resolution: Won't Fix

> Run autorefactor
> 
>
> Key: ACCUMULO-4823
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4823
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Priority: Major
> Fix For: 2.0.0
>
>
> Could look into running this tool on Accumulo source code.
>  
> http://autorefactor.org/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-4837) Allow short service names in addition to class names.

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4837.

Resolution: Duplicate

https://github.com/apache/accumulo/issues/1124

> Allow short service names in addition to class names.
> -
>
> Key: ACCUMULO-4837
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4837
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Priority: Major
>
> In 2.0.0-SNAPSHOT the cache implementation was made configurable.  Currently 
> to configure it, you set a property like the following.
> {noformat}
>  
> tserver.cache.manager.class=org.apache.accumulo.core.file.blockfile.cache.tinylfu.TinyLfuBlockCacheManager
> {noformat}
> I would much rather be able to provide a short service name like the 
> following when configuring the cache.  However, I do not want the list to be 
> predefined; I want the user to be able to provide implementations.
> {noformat}
>   tserver.cache.implementation=tinylfu
> {noformat}
> What is a good way to do this? Is there a good reason not to do this and just 
> stick with class names only?  I was also thinking it may be nice to have a 
> shell command for listing services, but this could be done independently.
> One way I thought of doing this is having an interface like the following 
> that services (balancer, compaction strategy, cache, etc.) could implement.
> {code:java}
> public interface AccumuloService {
>   /**
>* A human readable, short, unique identifier that can be specified in 
> configuration to identify a service implementation.
>*/
>   public String getName();
> 
>   public static <C extends AccumuloService> C load(String configId, 
> Class<C> serviceType, ClassLoader classLoader) {
> ServiceLoader<C> services = ServiceLoader.load(serviceType, classLoader);
> for (C service : services) {
>   if (service.getName().equals(configId) || 
> service.getClass().getName().equals(configId)) {
> return service;
>   }
> }
> return null;
>   }
> }
> {code}
> Then the cache implementation could provide a name
> {code:java}
> //assume BlockCacheManager implements AccumuloService
> public class TinyLfuBlockCacheManager extends BlockCacheManager {
>   private static final Logger LOG = 
> LoggerFactory.getLogger(TinyLfuBlockCacheManager.class);
>   @Override
>   protected TinyLfuBlockCache createCache(Configuration conf, CacheType type) 
> {
> LOG.info("Creating {} cache with configuration {}", type, conf);
> return new TinyLfuBlockCache(conf, type);
>   }
>   @Override
>   public String getName() {
> return "tinylfu";
>   }
> }
> {code}
> The code to load a block cache impl would look like the following:
> {code:java}
> String impl = conf.get(Property.TSERV_CACHE_MANAGER_IMPL);
> BlockCacheManager bcm = AccumuloService.load(impl, 
> BlockCacheManager.class, AccumuloVFSClassLoader.getClassLoader());
> {code}
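One detail the sketch above depends on: `java.util.ServiceLoader` only discovers implementations listed in a provider-configuration file named after the service interface's fully qualified class name. Each jar supplying an implementation would need an entry along these lines (the exact package names here are illustrative, taken from the property value quoted earlier):

```
# file: META-INF/services/org.apache.accumulo.core.file.blockfile.cache.BlockCacheManager
org.apache.accumulo.core.file.blockfile.cache.tinylfu.TinyLfuBlockCacheManager
```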



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-4822) Remove observers from configuration

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4822.

Resolution: Duplicate

https://github.com/apache/accumulo/issues/1123

> Remove observers from configuration
> ---
>
> Key: ACCUMULO-4822
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4822
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Priority: Major
> Fix For: 2.0.0
>
>
> As part of the work done for ACCUMULO-4779 the method getUpdateCount was 
> added to AccumuloConfiguration.  Subclasses of AccumuloConfiguration 
> increment a counter each time the configuration changes, and this counter is 
> available via getUpdateCount.  Anything derived from configuration can be 
> cached, and this counter can be used to know when it needs to be derived 
> again.
> Some AccumuloConfigurations are also observable.  I think this observer 
> pattern could be dropped in favor of using the counter, which would simplify 
> the code.
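A minimal sketch of the counter-based caching described above (illustrative code, not Accumulo's implementation): the derived value is recomputed lazily when the update count changes, so no observer registration or callback plumbing is needed:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Sketch (not Accumulo code): cache a value derived from configuration and
// recompute it only when the configuration's update count has changed.
public class DerivedCache<T> {
  private final AtomicLong updateCount; // shared with the configuration
  private final Supplier<T> derive;     // how to derive the value from config
  private long derivedAtCount = -1;
  private T cached;

  public DerivedCache(AtomicLong updateCount, Supplier<T> derive) {
    this.updateCount = updateCount;
    this.derive = derive;
  }

  public synchronized T get() {
    long current = updateCount.get(); // analogous to getUpdateCount()
    if (current != derivedAtCount) {  // config changed since last derivation
      cached = derive.get();
      derivedAtCount = current;
    }
    return cached;
  }
}
```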



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-4821) Make thrift transport pool more concurrent

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4821.

Resolution: Duplicate

https://github.com/apache/accumulo/issues/1122

> Make thrift transport pool more concurrent
> --
>
> Key: ACCUMULO-4821
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4821
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Priority: Major
>
> The thrift transport pool has a global lock; it would be nice to remove this 
> and replace it with something more concurrent.  See ACCUMULO-4788.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-1280) Add close method to iterators

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-1280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-1280.

Resolution: Won't Fix

> Add close method to iterators
> -
>
> Key: ACCUMULO-1280
> URL: https://issues.apache.org/jira/browse/ACCUMULO-1280
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Priority: Major
> Fix For: 2.0.0
>
>  Time Spent: 11h 50m
>  Remaining Estimate: 0h
>
> It would be useful if Accumulo iterators had a close method.  Accumulo would 
> call this when it's finished using the iterator stack.
> How would this work w/ isolation?
> Is it ok to break the iterator API?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-2272) Refactor metadata operations into a common API

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-2272.

Resolution: Duplicate

Work was already done for 2.0.0 to add an abstraction layer for reading from 
the metadata table.  The following issue is about writing to the metadata table.

https://github.com/apache/accumulo/issues/816

> Refactor metadata operations into a common API
> --
>
> Key: ACCUMULO-2272
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2272
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Christopher Tubbs
>Priority: Major
> Fix For: 2.0.0
>
>
> We have a lot of code that updates/modifies entries in the metadata tables, 
> and provides metadata information for tablets to clients. It'd be better 
> (code readability, modifiability) if we could abstract these metadata 
> operations into a usable API, rather than have separate code spread out for 
> the different kinds of metadata (zookeeper for the root table, root table for 
> the metadata tablets, metadata table for user tablets).
> A single API, with a factory to get the right implementation, depending on 
> which table's metadata is being manipulated, would be much easier to work 
> with and would help avoid bugs related to tablet management, updating table 
> state, etc.
> A minimal API has been added (o.a.a.core.metadata.MetadataServicer) when the 
> root tablet was moved to its own table, as a starting point, but was not 
> fully leveraged due to time constraints.
> To be clear, this improvement is entirely a refactoring of internal code. 
> There should be no impact on the user experience (unless it helps 
> find/prevent bugs).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-2268) Use conditional mutations to update metadata table

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-2268.

Resolution: Duplicate

https://github.com/apache/accumulo/issues/1121

> Use conditional mutations to update metadata table
> --
>
> Key: ACCUMULO-2268
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2268
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Priority: Major
> Fix For: 2.0.0
>
>
> For correctness Accumulo requires that only one tablet server at a time serve 
> a tablet.   In order to enforce this constraint, Accumulo uses zookeeper 
> locks.  It's assumed when a tablet server lock disappears that the tablet 
> server will kill itself.  Therefore a tablet that's assigned to a dead tablet 
> server can be safely reassigned.  However sometimes tablet servers continue 
> to operate for a period of time after losing their locks.  Sometimes this is 
> caused by bugs in Accumulo, sometimes it's the Java GC or swapping (and the 
> tserver does die), sometimes it's problems with zookeeper (like the zk thread 
> that reports lock lost dies).
> In Accumulo 1.6 conditional mutations were added.  Making all tablet metadata 
> updates use conditional mutations could make multiply-assigned tablets less 
> able to do damage.   
> For example, if after a minor compaction the metadata update mutation 
> required the tablet location to be the current tserver, it would prevent a 
> zombie tserver from adding an extraneous file to the metadata table for a 
> tablet.
> [~ctubbsii] has discussed refactoring all metadata code so that it's more 
> modular and works with zookeeper (for the root tablet) and the metadata 
> table using the same API.  This solution could depend on that.  It may also 
> be useful to make the root tablet operate more like a regular tablet and 
> store its list of files in zookeeper.  Then, with the right abstraction 
> layer, the root tablet could benefit from these changes.
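A toy model of the check-and-set semantics proposed above (this deliberately avoids the real Accumulo ConditionalMutation API and metadata schema; the column names are made up): a file entry is added to a tablet's metadata row only if the location column still names the requesting tserver, so a tserver whose assignment was revoked cannot sneak in an extraneous file:

```java
import java.util.Map;

// Toy model (not the Accumulo API) of a conditional metadata update.
// A tablet's metadata row is modeled as a column-name -> value map.
public class MetadataCas {

  // Returns true and applies the update only if the "loc" column still
  // names the requesting tserver -- the condition a ConditionalMutation
  // would carry.  A zombie tserver fails the check and changes nothing.
  public static boolean addFileIfLocated(Map<String, String> tabletRow,
      String tserver, String fileColumn, String fileValue) {
    synchronized (tabletRow) {
      if (!tserver.equals(tabletRow.get("loc"))) {
        return false; // condition failed: tablet was reassigned
      }
      tabletRow.put(fileColumn, fileValue);
      return true;
    }
  }
}
```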



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-4154) Improve batch writer

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4154.

Resolution: Duplicate

https://github.com/apache/accumulo/issues/1120

> Improve batch writer
> 
>
> Key: ACCUMULO-4154
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4154
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Priority: Major
>
> The batch writer currently has two drawbacks:
>  * It waits for its memory to be half full and then bins mutations for send 
> threads.  I don't think this is optimal.  It would be better to keep the 
> send threads busy: as soon as there are mutations, start working on them. 
> If the send threads can not keep up, then work will naturally build up (w/o 
> waiting for memory to be half full).
>  * The flush method blocks threads trying to add anything to the batch writer.
> Thinking of implementing the following model for the batch writer, which is 
> similar to how the conditional writer works.
>   * Have a queue that all incoming mutations are added to.
>   * Have a queue per tablet server.
>   * Have a single thread that is constantly taking batches of mutations off 
> the incoming queue, binning them, and placing them on tablet server queues.
>   * When a send thread becomes idle, have it select and reserve the tablet 
> server queue with the most work on it.
>   * When mutations fail, send threads can add them back to the incoming queue.
> To get better flushing behavior, as each mutation is added to the batch 
> writer it can be assigned a one up counter.  We can keep track of the 
> minimum in-progress mutation.  Flush can inspect this counter and wait for 
> the minimum active mutation to reach a certain count.
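The one-up counter idea in the last paragraph can be sketched as follows (illustrative code, not the actual BatchWriter): flush only has to wait for mutations that were already in progress when it was called, and mutations added afterwards do not block it:

```java
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.atomic.AtomicLong;

// Sketch (not Accumulo code) of flush tracking via one-up mutation ids:
// flush() observes the current id, then waits until the minimum
// still-in-progress id passes it.
public class FlushTracker {
  private final AtomicLong nextId = new AtomicLong();
  private final ConcurrentSkipListSet<Long> inProgress = new ConcurrentSkipListSet<>();

  public long add() {              // called when a mutation enters the writer
    long id = nextId.incrementAndGet();
    inProgress.add(id);
    return id;
  }

  public void complete(long id) {  // called by a send thread after a write
    inProgress.remove(id);
  }

  // flush() would loop/wait until this returns true for the id it observed;
  // mutations added after that id never hold the flush up.
  public boolean flushedThrough(long id) {
    return inProgress.isEmpty() || inProgress.first() > id;
  }
}
```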



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-3501) Attempt to make read of walog for replication local

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-3501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-3501.

Resolution: Won't Fix

> Attempt to make read of walog for replication local
> ---
>
> Key: ACCUMULO-3501
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3501
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Priority: Major
>
> Replication schedules a task on random tserver to read data from a walog for 
> replication.  We could attempt to have that work occur on a tserver where the 
> walog is local.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-2960) investigate failure while adding walog to tablet

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-2960.

Resolution: Not A Problem

Because of the changes to how walogs work in 1.8.0, I suspect this is no 
longer a problem.

> investigate failure while adding walog to tablet
> 
>
> Key: ACCUMULO-2960
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2960
> Project: Accumulo
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1, 1.6.0
>Reporter: Keith Turner
>Priority: Major
>
> While reviewing ACCUMULO-2889 I noticed a possible problem w/ how walogs are 
> added to a table.  The following steps are taken
>  # add walog to tablet's in-memory data structs
>  # define tablet in walog
>  # reference tablet in metadata table
> It seems like if there is an exception after step 1, the tablet will think 
> the walog is referenced in the metadata table when it's not.  This could 
> lead to data loss.  However, it may be that steps 2 and 3 retry so 
> aggressively that this never happens in practice?
> We could possibly update the tablet's in-memory data structs after steps 2 
> and 3.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-2801) define tablet syncs walog for each tablet in a batch

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-2801.

Resolution: Duplicate

> define tablet syncs walog for each tablet in a batch
> 
>
> Key: ACCUMULO-2801
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2801
> Project: Accumulo
>  Issue Type: Bug
>Affects Versions: 1.5.0, 1.5.1, 1.6.0
>Reporter: Keith Turner
>Priority: Major
>
> When the batch writer sends a batch of mutations for N tablets that are not 
> currently using a walog, define tablet will be called for each tablet.  
> Define tablet will sync the walog.   In hadoop 2 hsync is used, which is much 
> slower than hadoop1 sync calls.  If hsync takes 50ms and there are 100 
> tablets, then this operation would take 5 secs.  The calls to define tablet 
> do not occur frequently, just when walogs switch or tablets are loaded so the 
> cost will be amortized.  Ideally there could be one walog sync call for all 
> of the tablets in a batch of mutations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (ACCUMULO-4395) MiniAccumulo is not working with sbt.

2019-04-23 Thread Keith Turner (JIRA)


[ 
https://issues.apache.org/jira/browse/ACCUMULO-4395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16824224#comment-16824224
 ] 

Keith Turner edited comment on ACCUMULO-4395 at 4/23/19 3:13 PM:
-

I suspect the following work addresses this issue.

https://github.com/apache/accumulo/issues/942
https://github.com/apache/accumulo/issues/963


was (Author: kturner):
I suspect the following work address this issue.

https://github.com/apache/accumulo/issues/924
https://github.com/apache/accumulo/issues/963

> MiniAccumulo is not working with sbt.
> -
>
> Key: ACCUMULO-4395
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4395
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Keith Turner
>Priority: Major
>
> As outlined [on the user 
> list|https://lists.apache.org/thread.html/907d6d868bc9bfd7d700289cb8542b2a90dc1e99e74db7b7cb96fc2d@%3Cuser.accumulo.apache.org%3E]
>  MiniAccumulo is not working with [SBT|http://www.scala-sbt.org/]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-4395) MiniAccumulo is not working with sbt.

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4395.

Resolution: Duplicate

I suspect the following work addresses this issue.

https://github.com/apache/accumulo/issues/924
https://github.com/apache/accumulo/issues/963

> MiniAccumulo is not working with sbt.
> -
>
> Key: ACCUMULO-4395
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4395
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Keith Turner
>Priority: Major
>
> As outlined [on the user 
> list|https://lists.apache.org/thread.html/907d6d868bc9bfd7d700289cb8542b2a90dc1e99e74db7b7cb96fc2d@%3Cuser.accumulo.apache.org%3E]
>  MiniAccumulo is not working with [SBT|http://www.scala-sbt.org/]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-2175) Batch defining tablets in walog

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-2175.

Resolution: Duplicate

https://github.com/apache/accumulo/issues/1118

> Batch defining tablets in walog
> ---
>
> Key: ACCUMULO-2175
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2175
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Priority: Major
> Fix For: 2.0.0
>
>
> If a batch of mutations comes into a tablet server AND the tablet server just 
> got a new walog then it will sync the walog for each tablet.  Below is a 
> sketch of what the tablet server currently does.
> {code:java}
> for (Tablet t : tabletsInMutationBatch) {
>   if (!tabletIsDefinedInWalog(t, currentWalog)) {
> defineTablet(currentWalog, t); // syncs walog
> addWalogToMetadataTable(currentWalog, t); // synchronous metadata table 
> update
>   }
> }
> {code}
> Seems like doing the following would be better.  Then  no matter how many 
> undefined tablets there are, only one walog sync would be done.
> {code:java}
> Set<Tablet> undefined = new HashSet<>();
> for (Tablet t : tabletsInMutationBatch) {
>   if (!tabletIsDefinedInWalog(t, currentWalog)) {
> undefined.add(t);
>   }
> }
> defineTablets(currentWalog, undefined); // syncs walog after writing all 
> definitions
> addWalogToMetadataTable(currentWalog, undefined); // synchronous metadata 
> table batch write
> {code}
> There is no problem when all tablets in a batch update are defined in the 
> walog.  In this case a batch update that contains multiple tablets will only 
> sync the log once, after adding all the mutations from all tablets.
> Noticed this while looking into ACCUMULO-2172



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-4166) Limit compactions per tserver

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4166.

Resolution: Duplicate

https://github.com/apache/accumulo/issues/564

> Limit compactions per tserver
> -
>
> Key: ACCUMULO-4166
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4166
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Priority: Major
> Fix For: 2.0.0
>
>
> For user initiated compactions, I would like to be able to limit the number 
> of threads that run compactions on each tablet server.  My use case is that 
> for Fluo tables I need to periodically compact ranges of a table to garbage 
> collect Fluo transactional data.  However, I would like these compactions to 
> have minimal impact.
> Being able to do the following would be nice, where the {{-m}} option is the 
> max number of compactions to run on any tablet server.
> {noformat}
>compact -t myTable  -b p: -e p:~  -m 1
> {noformat}
> Thinking the best implementation would be to push this information to the 
> tablet server and let it manage, rather than having the master try to 
> coordinate.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-1635) Support external configuration in client API

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-1635.

Resolution: Duplicate

https://github.com/apache/accumulo/pull/636

> Support external configuration in client API
> 
>
> Key: ACCUMULO-1635
> URL: https://issues.apache.org/jira/browse/ACCUMULO-1635
> Project: Accumulo
>  Issue Type: Sub-task
>Reporter: Keith Turner
>Priority: Major
> Fix For: 2.0.0
>
>
> Currently things like timeout, batch writer max mem, etc. are only 
> configurable in the API by making method calls.  It would be nice if a user 
> could pass a config file to the connector that provides default 
> configurations for things like timeout, batch writer max memory, etc.
> Since Accumulo is used on a cluster, in addition to a config file, supporting 
> configuration profiles in zookeeper would also be nice.  For example I can 
> tell my application to use configuration profile X in zookeeper.   If I do 
> not specify a batch writer max mem, then it will look in the configuration 
> profile in zookeeper.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-1021) Provide default key management thats secure

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-1021.

Resolution: Won't Fix

> Provide default key management thats secure
> ---
>
> Key: ACCUMULO-1021
> URL: https://issues.apache.org/jira/browse/ACCUMULO-1021
> Project: Accumulo
>  Issue Type: New Feature
>  Components: master, tserver
>Reporter: Keith Turner
>Priority: Major
> Fix For: 2.0.0
>
>
> There are a few tickets to support encrypting data at rest in Accumulo.   
> Encryption in a cluster is useless w/o good key management.   Users should 
> have the ability to plug in their own key management.  Out of the box, 
> Accumulo should provide a plugin for key management that's secure.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-3499) Use tablename instead of table id in configuration of replication

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-3499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-3499.

Resolution: Won't Fix

> Use tablename instead of table id in configuration of replication
> -
>
> Key: ACCUMULO-3499
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3499
> Project: Accumulo
>  Issue Type: Improvement
>  Components: replication
>Reporter: Keith Turner
>Priority: Major
> Fix For: 2.0.0
>
>
> Currently replication expects users to set the table id of the table on the 
> remote system.  It would be more user-friendly if users provided a table 
> name and the implementation resolved that to a table id and stored the 
> table id in Accumulo config.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-3935) Not easy to configure server side extension using NewTableConfiguration

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-3935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-3935.

Resolution: Duplicate

> Not easy to configure server side extension using NewTableConfiguration
> ---
>
> Key: ACCUMULO-3935
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3935
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Priority: Major
> Fix For: 2.0.0
>
>
> Using {{NewTableConfiguration}}, iterators, constraints, and locality groups 
> can all be configured before a new table comes online.  For existing tables 
> there are convenience methods that help set the properties for these things. 
> However, there is no way to leverage these convenience methods with 
> {{NewTableConfiguration}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-3998) Add sampling support to proxy

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-3998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-3998.

Resolution: Won't Fix

> Add sampling support to proxy
> -
>
> Key: ACCUMULO-3998
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3998
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
> Fix For: 2.0.0
>
>
> Need to add support for sampling to thrift proxy API





[jira] [Resolved] (ACCUMULO-2007) Compaction and flush of root table(t) does not work

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-2007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-2007.

Resolution: Duplicate

https://github.com/apache/accumulo/issues/798

> Compaction and flush of root table(t) does not work
> ---
>
> Key: ACCUMULO-2007
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2007
> Project: Accumulo
>  Issue Type: Bug
>Affects Versions: 1.4.0
>Reporter: Keith Turner
>Priority: Major
> Fix For: 2.0.0
>
>
> Compacting and flushing the root table[t] does not work properly.  In the case 
> of flush, I think it will most likely initiate the flush but will not wait 
> for it.  In the case of compaction, I am not sure if it will even initiate 
> the compaction.  
> The root table tablet needs to store its flush and compaction count in 
> zookeeper.   The master would need to wait for it to increment in zookeeper.
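The wait-for-increment idea could be sketched as a small polling loop. In a real implementation the supplier would read the flush count from ZooKeeper; here an AtomicLong stands in so the sketch is self-contained, and all names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Sketch: the master records the count when a flush starts, then waits for
// the root tablet to bump the counter past that value.
public class FlushWaiter {

  public static long waitForIncrement(LongSupplier counter, long startCount,
      long maxPolls) throws InterruptedException {
    for (long i = 0; i < maxPolls; i++) {
      long current = counter.getAsLong();
      if (current > startCount) {
        return current;
      }
      Thread.sleep(1); // real code would use a backoff or a ZooKeeper watch
    }
    throw new IllegalStateException("flush did not complete");
  }

  public static void main(String[] args) throws InterruptedException {
    AtomicLong zkCounter = new AtomicLong(5); // stand-in for the ZK node
    new Thread(zkCounter::incrementAndGet).start();
    System.out.println("flush count now "
        + waitForIncrement(zkCounter::get, 5, 100_000));
  }
}
```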





[jira] [Resolved] (ACCUMULO-1902) DefaultSecretKeyEncryptionStrategy will not work w/ multiple namenodes

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-1902.

Resolution: Won't Fix

> DefaultSecretKeyEncryptionStrategy will not work w/ multiple namenodes
> --
>
> Key: ACCUMULO-1902
> URL: https://issues.apache.org/jira/browse/ACCUMULO-1902
> Project: Accumulo
>  Issue Type: Bug
>  Components: master, tserver
>Reporter: Keith Turner
>Assignee: Michael Allen
>Priority: Major
> Fix For: 2.0.0
>
>
> Code in DefaultSecretKeyEncryptionStrategy gets the default filesystem.  If a 
> fully qualified URL that references a different namenode than the default is 
> passed in, then an exception will be thrown.
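The mismatch can be illustrated without an HDFS dependency. In Hadoop the fix amounts to resolving the filesystem from the path itself (path.getFileSystem(conf)) rather than taking the default (FileSystem.get(conf)); the check below uses only java.net.URI, and the host names are hypothetical.

```java
import java.net.URI;

// Sketch of the bug's essence: a fully qualified path may name a different
// namenode (URI authority) than the configured default filesystem.
public class NamenodeCheck {

  public static boolean sameNamenode(String defaultFs, String path) {
    URI a = URI.create(defaultFs);
    URI b = URI.create(path);
    // A relative or scheme-less path has no authority and implicitly
    // uses the default filesystem.
    return b.getAuthority() == null || b.getAuthority().equals(a.getAuthority());
  }

  public static void main(String[] args) {
    // Fully qualified path on a second namenode: must not be handed to the
    // default filesystem's client.
    System.out.println(sameNamenode("hdfs://nn1:8020", "hdfs://nn2:8020/accumulo/wal"));
  }
}
```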





[jira] [Resolved] (ACCUMULO-4592) Add since information to properties

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4592.

Resolution: Duplicate

https://github.com/apache/accumulo/issues/1117

> Add since information to properties
> ---
>
> Key: ACCUMULO-4592
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4592
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Priority: Major
>
> It would be very useful to document, in the user manual, the version in which 
> each property was added.  Not sure what the best way to do this is.
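One possible approach is to record the since-version alongside each property definition so a manual generator can print it. The enum below mirrors the general shape of a property registry, but the entries, keys, and method names are made up for illustration.

```java
// Hypothetical property registry that carries the version each property
// was added in, so documentation can be generated from it.
public enum DocumentedProperty {
  TSERV_EXAMPLE_TIMEOUT("tserver.example.timeout", "30s", "1.7.0"),
  TABLE_EXAMPLE_LIMIT("table.example.limit", "100", "2.0.0");

  private final String key;
  private final String defaultValue;
  private final String since;

  DocumentedProperty(String key, String defaultValue, String since) {
    this.key = key;
    this.defaultValue = defaultValue;
    this.since = since;
  }

  // Renders one user-manual line for this property.
  public String manualEntry() {
    return key + " (default " + defaultValue + ", since " + since + ")";
  }

  public static void main(String[] args) {
    for (DocumentedProperty p : values()) {
      System.out.println(p.manualEntry());
    }
  }
}
```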





[jira] [Resolved] (ACCUMULO-4573) Writing a correct compaction strategy is difficult.

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4573.

Resolution: Duplicate

https://github.com/apache/accumulo/issues/564

> Writing a correct compaction strategy is difficult.
> ---
>
> Key: ACCUMULO-4573
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4573
> Project: Accumulo
>  Issue Type: Bug
>Affects Versions: 1.6.6, 1.7.2, 1.8.0
>Reporter: Keith Turner
>Priority: Major
>
> Compaction strategies have two methods that are not supposed to block.  If 
> someone does write a strategy that blocks, it can cause scans to block.   
> Even though this is documented in the javadoc, it is very tricky to get 
> right.  It would be better to change the interaction so that a blocking 
> strategy is not harmful.
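One way to make a blocking strategy harmless is to run the decision on its own executor and fall back to a default answer if it does not return promptly. This is only a sketch of that idea, not the project's actual fix; the class and method names are illustrative.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch: isolate the strategy call so scan threads are never blocked by it.
public class SafeStrategyCaller {
  private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
    Thread t = new Thread(r);
    t.setDaemon(true);
    return t;
  });

  public static boolean shouldCompact(Callable<Boolean> strategy, long timeoutMs,
      boolean fallback) {
    Future<Boolean> f = POOL.submit(strategy);
    try {
      return f.get(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      f.cancel(true); // a slow strategy is abandoned, not waited on
      return fallback;
    } catch (Exception e) {
      return fallback;
    }
  }

  public static void main(String[] args) {
    // A misbehaving strategy that blocks forever.
    Callable<Boolean> blocking = () -> {
      new CountDownLatch(1).await();
      return true;
    };
    System.out.println(shouldCompact(blocking, 50, false));
  }
}
```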





[jira] [Resolved] (ACCUMULO-4838) Create SPI package

2019-04-23 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4838.

Resolution: Fixed

> Create SPI package
> --
>
> Key: ACCUMULO-4838
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4838
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Priority: Major
>
> Accumulo has multiple pluggable services.  It would be nice if the SPIs 
> (service provider interface) for these were in one package.  For existing 
> service SPIs, this can not be easily done.  However, for new service SPIs it 
> would be nice to start putting them under a single package.  This package 
> could be {{org.apache.accumulo.core.spi}}.  Currently there are at least two 
> new unreleased SPIs for caching and summarization.  These could be moved to 
> the new package.
> For existing SPIs, we could possibly do the following:
>  * Create a new SPI in the new package
>  * Make existing SPI extend new SPI and deprecate it
> The contents of this package could be analyzed by APILyzer to ensure only API 
> and SPI types are used.
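The extend-and-deprecate step could be sketched as below: the SPI is defined in the new package and the old interface extends it, so existing implementations keep compiling and satisfy the new type. The interface names are made up; only the pattern is from the description.

```java
// Sketch of the migration pattern: old SPI extends new SPI and is deprecated.
public class SpiMigration {

  // New SPI under the consolidated package
  // (e.g. org.apache.accumulo.core.spi in the real layout).
  public interface BlockCacheFactory {
    String create(String config);
  }

  // Old SPI kept for compatibility; any implementation of it is
  // automatically usable wherever the new SPI is expected.
  @Deprecated
  public interface LegacyBlockCacheFactory extends BlockCacheFactory {}

  public static void main(String[] args) {
    LegacyBlockCacheFactory legacy = cfg -> "cache:" + cfg;
    BlockCacheFactory viaNewSpi = legacy; // old implementations satisfy the new SPI
    System.out.println(viaNewSpi.create("lru"));
  }
}
```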





[jira] [Commented] (ACCUMULO-4861) Log recovery can not succeed after failed marker written in recovery directory

2019-02-14 Thread Keith Turner (JIRA)


[ 
https://issues.apache.org/jira/browse/ACCUMULO-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16768704#comment-16768704
 ] 

Keith Turner commented on ACCUMULO-4861:


I opened a github issue and added a link to it.

> Log recovery can not succeed after failed marker written in recovery directory
> --
>
> Key: ACCUMULO-4861
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4861
> Project: Accumulo
>  Issue Type: Bug
>  Components: tserver
>Reporter: Bill Oley
>Priority: Minor
>
> Accumulo version 1.8.1.  When tserver log recovery fails in LogSorter.sort() 
> and a FailedMarkerPath (failed) file is written to the recovery directory, 
> subsequent recoveries fail due to the presence of this file.  In subsequent 
> recovery attempts, when the subdirectories of the recovery root directory are 
> iterated over in the RecoveryLogReader constructor, the failed marker is 
> interpreted as any other partial results directory and an exception is 
> generated when the underlying data can not be read.  This happens even when a 
> finished marker exists (both were present with the same timestamp - another 
> issue?) because failed comes before finished alphabetically.
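The shape of a fix is to treat the status markers as non-data entries when scanning the recovery directory. The sketch below filters a directory listing in plain Java; the directory names follow the issue, but the class and method are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Sketch: skip the "failed" and "finished" status markers so only
// partial-sort data directories are opened by the reader.
public class RecoveryDirScanner {

  public static List<String> sortDataDirs(List<String> entries) {
    return entries.stream()
        .filter(name -> !name.equals("failed") && !name.equals("finished"))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // "failed" sorts before "finished" alphabetically, which is exactly why
    // the marker was hit first in the reported bug.
    List<String> entries = Arrays.asList("failed", "finished", "part-r-00000");
    System.out.println(sortDataDirs(entries));
  }
}
```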





[jira] [Resolved] (ACCUMULO-999) percolator

2018-10-05 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-999.
---
Resolution: Won't Fix

This was done as an external project.

https://fluo.apache.org

> percolator
> --
>
> Key: ACCUMULO-999
> URL: https://issues.apache.org/jira/browse/ACCUMULO-999
> Project: Accumulo
>  Issue Type: New Feature
>  Components: tserver
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Minor
>
> Add a percolator-like capability to Accumulo that supports continuous data 
> transformation and analysis.
> http://research.google.com/pubs/pub36726.html





[jira] [Commented] (ACCUMULO-3806) Failing to create a table/namespace because it already exists should not be a warning

2018-07-24 Thread Keith Turner (JIRA)


[ 
https://issues.apache.org/jira/browse/ACCUMULO-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16554278#comment-16554278
 ] 

Keith Turner commented on ACCUMULO-3806:


I have seen code where this is expected behavior, like launching lots of 
processes that all attempt to create a table, or an app that always attempts to 
create a table on startup.  If ACCUMULO-3925 has not already handled this, I do 
not think it should create a warning.  The client code calling create table 
gets an exception and knows exactly what happened based on the exception.
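The client pattern described above could be sketched as an idempotent "ensure" helper that treats "already exists" as success. The exception class here is a local stand-in for Accumulo's TableExistsException, and the in-memory table set is purely illustrative.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: concurrent creators all attempt the create; losing the race is
// expected and is not treated as an error.
public class EnsureTable {
  static class TableExistsException extends Exception {}

  private static final Set<String> TABLES = ConcurrentHashMap.newKeySet();

  // Stand-in for TableOperations.create(name).
  static void createTable(String name) throws TableExistsException {
    if (!TABLES.add(name)) {
      throw new TableExistsException();
    }
  }

  // Idempotent startup helper: success and "already exists" are both fine.
  public static boolean ensure(String name) {
    try {
      createTable(name);
      return true; // we created it
    } catch (TableExistsException e) {
      return false; // someone else did; nothing worth a server-side WARN
    }
  }

  public static void main(String[] args) {
    System.out.println(ensure("app_data"));
    System.out.println(ensure("app_data"));
  }
}
```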

> Failing to create a table/namespace because it already exists should not be a 
> warning
> -
>
> Key: ACCUMULO-3806
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3806
> Project: Accumulo
>  Issue Type: Improvement
>  Components: fate
>Reporter: Josh Elser
>Priority: Major
>  Labels: newbie
> Fix For: 2.0.0
>
> Attachments: 
> 0001-ACCUMULO-3806-changed-checkTableDoesNotExist-in-accu.patch
>
>
> This is a really common occurrence when you're running randomwalk:
> {noformat}
> Failed to execute Repo, tid=63d0421f1b17b04a
>   ThriftTableOperationException(tableId:null, tableName:nspc_001.ctt_000, 
> op:CREATE, type:EXISTS, description:null)
>   at 
> org.apache.accumulo.master.tableOps.Utils.checkTableDoesNotExist(Utils.java:54)
>   at 
> org.apache.accumulo.master.tableOps.PopulateZookeeper.call(PopulateZookeeper.java:54)
>   at 
> org.apache.accumulo.master.tableOps.PopulateZookeeper.call(PopulateZookeeper.java:30)
>   at 
> org.apache.accumulo.master.tableOps.TraceRepo.call(TraceRepo.java:57)
>   at 
> org.apache.accumulo.fate.Fate$TransactionRunner.run(Fate.java:72)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Concurrent table creations run: only one succeeds and the others fail. This 
> is expected and what FATE was designed to handle. We shouldn't be pushing 
> these up to the monitor -- should probably be an info or debug message.





[jira] [Commented] (ACCUMULO-3806) Failing to create a table/namespace because it already exists should not be a warning

2018-07-23 Thread Keith Turner (JIRA)


[ 
https://issues.apache.org/jira/browse/ACCUMULO-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553090#comment-16553090
 ] 

Keith Turner commented on ACCUMULO-3806:


Does ACCUMULO-3925 address this issue with the introduction of 
{{AcceptableThriftTableOperationException.java}}?

> Failing to create a table/namespace because it already exists should not be a 
> warning
> -
>
> Key: ACCUMULO-3806
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3806
> Project: Accumulo
>  Issue Type: Improvement
>  Components: fate
>Reporter: Josh Elser
>Priority: Major
>  Labels: newbie
> Fix For: 2.0.0
>
> Attachments: 
> 0001-ACCUMULO-3806-changed-checkTableDoesNotExist-in-accu.patch
>
>
> This is a really common occurrence when you're running randomwalk:
> {noformat}
> Failed to execute Repo, tid=63d0421f1b17b04a
>   ThriftTableOperationException(tableId:null, tableName:nspc_001.ctt_000, 
> op:CREATE, type:EXISTS, description:null)
>   at 
> org.apache.accumulo.master.tableOps.Utils.checkTableDoesNotExist(Utils.java:54)
>   at 
> org.apache.accumulo.master.tableOps.PopulateZookeeper.call(PopulateZookeeper.java:54)
>   at 
> org.apache.accumulo.master.tableOps.PopulateZookeeper.call(PopulateZookeeper.java:30)
>   at 
> org.apache.accumulo.master.tableOps.TraceRepo.call(TraceRepo.java:57)
>   at 
> org.apache.accumulo.fate.Fate$TransactionRunner.run(Fate.java:72)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Concurrent table creations run: only one succeeds and the others fail. This 
> is expected and what FATE was designed to handle. We shouldn't be pushing 
> these up to the monitor -- should probably be an info or debug message.





[jira] [Updated] (ACCUMULO-3510) Create mechanism to support priority based scheduling of read ahead tasks.

2018-07-10 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner updated ACCUMULO-3510:
---
Fix Version/s: (was: 2.0.0)

> Create mechanism to support priority based scheduling of read ahead tasks. 
> ---
>
> Key: ACCUMULO-3510
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3510
> Project: Accumulo
>  Issue Type: Improvement
>  Components: tserver
>Affects Versions: 1.6.0
>Reporter: marco polo
>Assignee: marco polo
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I have many cases where ScanSessions will consume resources that I otherwise 
> want shorter running scans to utilize. In some cases, a scan may continue for 
> hours, while a short running scan may come in and execute quickly. As a 
> result, I want to be able to adjust the priority of these scan sessions. 
> I have a patch which is forthcoming, that breaks Session out of TabletServer 
> and replaces the queue in the readAheadThreadPool with a priority pool. The 
> comparator I have created as a proof of concept, which can be adjustable, 
> reduces the priority of the oldest scan. Using an aging technique, we 
> guarantee execute of these older running scans based upon the previous run 
> time. As a result, we give preference to newer scans. If they execute 
> quickly, older scans will have an inherent rise in priority. If they also 
> take a while, their priority will be reduced and incoming scans will yet 
> again be given a greater priority with the intent (and/or hope ) their 
> execution will be faster.
> Priority should be configurable based on the desired Session Comparator. 





[jira] [Resolved] (ACCUMULO-3510) Create mechanism to support priority based scheduling of read ahead tasks.

2018-07-10 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-3510.

Resolution: Duplicate

Work done in https://github.com/apache/accumulo/pull/549

> Create mechanism to support priority based scheduling of read ahead tasks. 
> ---
>
> Key: ACCUMULO-3510
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3510
> Project: Accumulo
>  Issue Type: Improvement
>  Components: tserver
>Affects Versions: 1.6.0
>Reporter: marco polo
>Assignee: marco polo
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I have many cases where ScanSessions will consume resources that I otherwise 
> want shorter running scans to utilize. In some cases, a scan may continue for 
> hours, while a short running scan may come in and execute quickly. As a 
> result, I want to be able to adjust the priority of these scan sessions. 
> I have a patch which is forthcoming, that breaks Session out of TabletServer 
> and replaces the queue in the readAheadThreadPool with a priority pool. The 
> comparator I have created as a proof of concept, which can be adjustable, 
> reduces the priority of the oldest scan. Using an aging technique, we 
> guarantee execution of these older running scans based upon their previous run 
> time. As a result, we give preference to newer scans. If they execute 
> quickly, older scans will have an inherent rise in priority. If they also 
> take a while, their priority will be reduced and incoming scans will yet 
> again be given a greater priority, with the intent (and/or hope) that their 
> execution will be faster.
> Priority should be configurable based on the desired Session Comparator. 





[jira] [Commented] (ACCUMULO-4074) create user-configurable resource pools for different kinds of requests

2018-07-10 Thread Keith Turner (JIRA)


[ 
https://issues.apache.org/jira/browse/ACCUMULO-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539291#comment-16539291
 ] 

Keith Turner commented on ACCUMULO-4074:


Work was done in https://github.com/apache/accumulo/pull/549

> create user-configurable resource pools for different kinds of requests
> ---
>
> Key: ACCUMULO-4074
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4074
> Project: Accumulo
>  Issue Type: Improvement
>  Components: client, tserver
>Reporter: Eric Newton
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Complex queries and iterator stacks can sometimes run for long periods of 
> time.  During that time, access to resources for shorter, simpler lookups can 
> be blocked.  Use separate resource pools to allow for simpler queries to be 
> able to run regardless.  This same mechanism could be used for the metadata 
> and root tables, too.





[jira] [Resolved] (ACCUMULO-4074) create user-configurable resource pools for different kinds of requests

2018-07-10 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4074.

Resolution: Duplicate

> create user-configurable resource pools for different kinds of requests
> ---
>
> Key: ACCUMULO-4074
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4074
> Project: Accumulo
>  Issue Type: Improvement
>  Components: client, tserver
>Reporter: Eric Newton
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Complex queries and iterator stacks can sometimes run for long periods of 
> time.  During that time, access to resources for shorter, simpler lookups can 
> be blocked.  Use separate resource pools to allow for simpler queries to be 
> able to run regardless.  This same mechanism could be used for the metadata 
> and root tables, too.





[jira] [Commented] (ACCUMULO-4592) Add since information to properties

2018-07-05 Thread Keith Turner (JIRA)


[ 
https://issues.apache.org/jira/browse/ACCUMULO-4592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16533745#comment-16533745
 ] 

Keith Turner commented on ACCUMULO-4592:


We could also possibly add an optional deprecated-since version parameter.

> Add since information to properties
> ---
>
> Key: ACCUMULO-4592
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4592
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Priority: Major
>
> It would be very useful to document, in the user manual, the version in which 
> each property was added.  Not sure what the best way to do this is.





[jira] [Resolved] (ACCUMULO-4629) Seeking in timestamp range is slow

2018-06-12 Thread Keith Turner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4629.

Resolution: Duplicate

> Seeking in timestamp range is slow
> --
>
> Key: ACCUMULO-4629
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4629
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Keith Turner
>Priority: Major
> Fix For: 2.0.0
>
>
> Fluo's internal schema uses the first 4 bits of the timestamp to store 
> different types of information per column.  These first 4 bits divide the 
> timestamp space up into 16 ranges.  Fluo has server side iterators that 
> consider information in one of these 16 ranges and then seek forward to 
> another of the 16 ranges.
> Unfortunately, Accumulo's built in iterator that processes delete marker 
> makes seeking within the timestamps of a column an O(N^2) operation.  This is 
> because of the way deletes work in Accumulo.  A delete marker for timestamp X 
> in Accumulo deletes anything with a timestamp <= X.  
> When seeking to timestamp Y, the Accumulo iterator that handles deletes will 
> scan from MAX_LONG to Y looking for any deletes that may keep you from seeing 
> data at timestamp Y.  The following example shows what the delete iterator 
> will do when a user iterator does some seeks.
>  * User iterator seeks to stamp 1,000,000.  This causes the delete iter to 
> scan from MAX_LONG to 1,000,000 looking for delete markers.
>  * User iterator seeks to stamp 900,000.  This causes the delete iter to scan 
> from MAX_LONG to 900,000 looking for delete markers.
>  * User iterator seeks to stamp 500,000.  This causes the delete iter to scan 
> from MAX_LONG to 500,000 looking for delete markers.
> So that Fluo can seek efficiently, it has done some [serious 
> shenanigans|https://github.com/apache/incubator-fluo/blob/rel/fluo-1.0.0-incubating/modules/accumulo/src/main/java/org/apache/fluo/accumulo/iterators/TimestampSkippingIterator.java#L164]
>  using reflection to remove  the DeleteIterator.  The great work being done 
> on ACCUMULO-3079 will likely break this crazy reflection code.  So I would 
> like to make a change to upstream Accumulo that allows efficient seeking in 
> the timestamp range.  I have thought of the following two possible solutions.
>  * Make the DeleteIterator stateful so that it remembers what ranges it has 
> scanned for deletes.  I don't really like this solution because it will add 
> expense to every seek in Accumulo for an edge case.
>  * Make it possible to create tables with an exact delete behavior, meaning a 
> delete for timestamp X will only delete an existing row column with that 
> exact timestamp.  This option could only be chosen at table creation time and 
> could not be changed.  With this delete behavior, the delete iterator does not 
> need to scan on every seek.
> Are there other possible solutions?
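The two delete semantics can be contrasted in a few lines. This sketch models a single column's versions as a timestamp-to-value map; it is only an illustration of the proposal, not Accumulo's iterator code.

```java
import java.util.Map;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;

// Sketch: standard Accumulo deletes suppress everything with ts <= X, while
// the proposed "exact" delete suppresses only ts == X, so no range scan is
// needed when seeking within a column's timestamps.
public class DeleteSemantics {

  public static SortedMap<Long, String> applyDeletes(
      SortedMap<Long, String> entries, Set<Long> deletes, boolean exact) {
    SortedMap<Long, String> visible = new TreeMap<>(entries);
    for (long d : deletes) {
      if (exact) {
        visible.remove(d); // only the exact timestamp is deleted
      } else {
        visible.headMap(d + 1).clear(); // removes every ts <= d
      }
    }
    return visible;
  }

  public static void main(String[] args) {
    SortedMap<Long, String> col =
        new TreeMap<>(Map.of(10L, "a", 20L, "b", 30L, "c"));
    System.out.println(applyDeletes(col, Set.of(20L), false).keySet()); // [30]
    System.out.println(applyDeletes(col, Set.of(20L), true).keySet());  // [10, 30]
  }
}
```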





[jira] [Commented] (ACCUMULO-4851) WAL recovery directory should be deleted before running LogSorter

2018-04-19 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16444394#comment-16444394
 ] 

Keith Turner commented on ACCUMULO-4851:


This is not a duplicate of #432

> WAL recovery directory should be deleted before running LogSorter
> -
>
> Key: ACCUMULO-4851
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4851
> Project: Accumulo
>  Issue Type: Bug
>  Components: tserver
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Critical
>
> Noticed this one on a user's 1.7-ish system.
> A number of tablets (~9) were unassigned and reported on the Monitor as 
> having failed to load. Digging into the exception, we could see the tablet 
> load failed due to a FileNotFoundException:
> {noformat}
> 2018-04-09 19:57:08,475 [tserver.TabletServer] WARN : exception trying to 
> assign tablet xk;... /accumulo/tables/xk/t-00pyzd0
> java.lang.RuntimeException: java.io.IOException: 
> java.io.FileNotFoundException: File does not exist: 
> /accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/failed/data
>     at org.apache.accumulo.tserver.tablet.Tablet.(Tablet.java:640)
>     at org.apache.accumulo.tserver.tablet.Tablet.(Tablet.java:449)
>     at 
> org.apache.accumulo.tserver.TabletServer$AssignmentHandler.run(TabletServer.java:2156)
>     at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>     at 
> org.apache.accumulo.tserver.ActiveAssignmentRunnable.run(ActiveAssignmentRunnable.java:61)
>     at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: java.io.FileNotFoundException: File does not 
> exist: /accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/failed/data
>     at 
> org.apache.accumulo.tserver.log.TabletServerLogger.recover(TabletServerLogger.java:480)
>     at 
> org.apache.accumulo.tserver.TabletServer.recover(TabletServer.java:3012)
>     at org.apache.accumulo.tserver.tablet.Tablet.(Tablet.java:590)
>     ... 9 more
> Caused by: java.io.FileNotFoundException: File does not exist: 
> /accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/failed/data
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1446)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1438)
>     at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1454)
>     at 
> org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1823)
>     at 
> org.apache.hadoop.io.MapFile$Reader.createDataFileReader(MapFile.java:456)
>     at org.apache.hadoop.io.MapFile$Reader.open(MapFile.java:429)
>     at org.apache.hadoop.io.MapFile$Reader.(MapFile.java:399)
>     at 
> org.apache.accumulo.tserver.log.MultiReader.(MultiReader.java:113)
>     at 
> org.apache.accumulo.tserver.log.SortedLogRecovery.recover(SortedLogRecovery.java:105)
>     at 
> org.apache.accumulo.tserver.log.TabletServerLogger.recover(TabletServerLogger.java:478)
>     ... 11 more
> 2018-04-09 19:57:08,476 [tserver.TabletServer] WARN : java.io.IOException: 
> java.io.FileNotFoundException: File does not exist: 
> /accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/failed/data
> 2018-04-09 19:57:08,476 [tserver.TabletServer] WARN : failed to open tablet 
> xk;... reporting failure to master
> 2018-04-09 19:57:08,476 [tserver.TabletServer] WARN : rescheduling tablet 
> load in 600.00 seconds
> {noformat}
> Upon further investigation of the recovery directory in HDFS for this WAL, we 
> find the following:
> {noformat}
> $ hdfs dfs -ls -R /accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/
> -rwxr--r--   3 accumulo hdfs  0 2018-04-06 22:12 
> accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/failed
> -rwxr--r--   3 accumulo hdfs  0 2018-04-06 22:10 
> accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/finished
> drwxr-xr-x   - accumulo hdfs  0 2018-04-06 22:09 
> accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/part-r-0
> -rw-r--r--   3 accumulo hdfs    8040761 2018-04-06 22:09 
> accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/part-r-0/data
> -rw-r--r--   3 accumulo hdfs    642 2018-04-06 22:09 
> 

[jira] [Reopened] (ACCUMULO-4851) WAL recovery directory should be deleted before running LogSorter

2018-04-19 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner reopened ACCUMULO-4851:


> WAL recovery directory should be deleted before running LogSorter
> -
>
> Key: ACCUMULO-4851
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4851
> Project: Accumulo
>  Issue Type: Bug
>  Components: tserver
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Critical
>
> Noticed this one on a user's 1.7-ish system.
> A number of tablets (~9) were unassigned and reported on the Monitor as 
> having failed to load. Digging into the exception, we could see the tablet 
> load failed due to a FileNotFoundException:
> {noformat}
> 2018-04-09 19:57:08,475 [tserver.TabletServer] WARN : exception trying to 
> assign tablet xk;... /accumulo/tables/xk/t-00pyzd0
> java.lang.RuntimeException: java.io.IOException: 
> java.io.FileNotFoundException: File does not exist: 
> /accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/failed/data
>     at org.apache.accumulo.tserver.tablet.Tablet.(Tablet.java:640)
>     at org.apache.accumulo.tserver.tablet.Tablet.(Tablet.java:449)
>     at 
> org.apache.accumulo.tserver.TabletServer$AssignmentHandler.run(TabletServer.java:2156)
>     at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>     at 
> org.apache.accumulo.tserver.ActiveAssignmentRunnable.run(ActiveAssignmentRunnable.java:61)
>     at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: java.io.FileNotFoundException: File does not 
> exist: /accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/failed/data
>     at 
> org.apache.accumulo.tserver.log.TabletServerLogger.recover(TabletServerLogger.java:480)
>     at 
> org.apache.accumulo.tserver.TabletServer.recover(TabletServer.java:3012)
>     at org.apache.accumulo.tserver.tablet.Tablet.(Tablet.java:590)
>     ... 9 more
> Caused by: java.io.FileNotFoundException: File does not exist: 
> /accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/failed/data
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1446)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1438)
>     at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1454)
>     at 
> org.apache.hadoop.io.SequenceFile$Reader.(SequenceFile.java:1823)
>     at 
> org.apache.hadoop.io.MapFile$Reader.createDataFileReader(MapFile.java:456)
>     at org.apache.hadoop.io.MapFile$Reader.open(MapFile.java:429)
>     at org.apache.hadoop.io.MapFile$Reader.(MapFile.java:399)
>     at 
> org.apache.accumulo.tserver.log.MultiReader.(MultiReader.java:113)
>     at 
> org.apache.accumulo.tserver.log.SortedLogRecovery.recover(SortedLogRecovery.java:105)
>     at 
> org.apache.accumulo.tserver.log.TabletServerLogger.recover(TabletServerLogger.java:478)
>     ... 11 more
> 2018-04-09 19:57:08,476 [tserver.TabletServer] WARN : java.io.IOException: 
> java.io.FileNotFoundException: File does not exist: 
> /accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/failed/data
> 2018-04-09 19:57:08,476 [tserver.TabletServer] WARN : failed to open tablet 
> xk;... reporting failure to master
> 2018-04-09 19:57:08,476 [tserver.TabletServer] WARN : rescheduling tablet 
> load in 600.00 seconds
> {noformat}
> Upon further investigation of the recovery directory in HDFS for this WAL, we 
> find the following:
> {noformat}
> $ hdfs dfs -ls -R /accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/
> -rwxr--r--   3 accumulo hdfs  0 2018-04-06 22:12 
> accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/failed
> -rwxr--r--   3 accumulo hdfs  0 2018-04-06 22:10 
> accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/finished
> drwxr-xr-x   - accumulo hdfs  0 2018-04-06 22:09 
> accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/part-r-0
> -rw-r--r--   3 accumulo hdfs    8040761 2018-04-06 22:09 
> accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/part-r-0/data
> -rw-r--r--   3 accumulo hdfs    642 2018-04-06 22:09 
> accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/part-r-0/index
> drwxr-xr-x   - 

[jira] [Resolved] (ACCUMULO-4836) Tables do not always wait for online or offline

2018-03-19 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4836.

Resolution: Fixed

> Tables do not always wait for online or offline
> ---
>
> Key: ACCUMULO-4836
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4836
> Project: Accumulo
>  Issue Type: Bug
>Affects Versions: 1.7.3
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.4, 1.9.0, 2.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> While investigating why TabletStateChangeIteratorIT was failing, it was 
> discovered that bringing a table online with wait=true does not always wait.  
> The test relied on this API to wait, which is why it was failing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ACCUMULO-3545) Mappers not running locally

2018-03-07 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-3545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner updated ACCUMULO-3545:
---
Fix Version/s: (was: 1.7.4)
   (was: 1.9.0)
   (was: 2.0.0)

> Mappers not running locally
> ---
>
> Key: ACCUMULO-3545
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3545
> Project: Accumulo
>  Issue Type: Bug
> Environment: Hadoop 2.6.0, ZK 3.4.5, Centos 6
>Reporter: Keith Turner
>Priority: Minor
>
> I ran CI to test 1.6.2RC3 on a 20 node EC2 cluster.  After it ran for 24hr I 
> stopped ingest and ran the M/R verify job.  Based on running {{listscans}} 
> in the shell I could see mappers were not running locally.  I saw multiple 
> error messages like the following when the M/R job started.
> {noformat}
> 15/01/29 22:14:42 WARN split.JobSplitWriter: Max block location exceeded for 
> split: Range: [14b5%00; : [] 9223372036854775807 false,1696969696969698%00; : 
> [] 9223372036854775807 false) Locations: [ip-10-1-2-21.ec2.internal, 
> ip-10-1-2-21.ec2.internal, ip-10-1-2-15.ec2.internal, 
> ip-10-1-2-13.ec2.internal, ip-10-1-2-16.ec2.internal, 
> ip-10-1-2-18.ec2.internal, ip-10-1-2-18.ec2.internal, 
> ip-10-1-2-18.ec2.internal, ip-10-1-2-18.ec2.internal, 
> ip-10-1-2-18.ec2.internal, ip-10-1-2-18.ec2.internal, 
> ip-10-1-2-18.ec2.internal, ip-10-1-2-27.ec2.internal, 
> ip-10-1-2-27.ec2.internal, ip-10-1-2-27.ec2.internal, 
> ip-10-1-2-27.ec2.internal, ip-10-1-2-27.ec2.internal, 
> ip-10-1-2-27.ec2.internal, ip-10-1-2-27.ec2.internal, 
> ip-10-1-2-27.ec2.internal, ip-10-1-2-27.ec2.internal, 
> ip-10-1-2-27.ec2.internal, ip-10-1-2-28.ec2.internal, 
> ip-10-1-2-28.ec2.internal, ip-10-1-2-20.ec2.internal, 
> ip-10-1-2-17.ec2.internal, ip-10-1-2-25.ec2.internal, 
> ip-10-1-2-25.ec2.internal, ip-10-1-2-25.ec2.internal, 
> ip-10-1-2-25.ec2.internal, ip-10-1-2-25.ec2.internal, 
> ip-10-1-2-25.ec2.internal] Table: ci TableID: 2 InstanceName: accumulo 
> zooKeepers: 10.1.2.10,10.1.2.11,10.1.2.12 principal: root tokenSource: INLINE 
> authenticationToken: 
> org.apache.accumulo.core.client.security.tokens.PasswordToken@fee189f1 
> authenticationTokenFile: null Authorizations:  offlineScan: false 
> mockInstance: false isolatedScan: false localIterators: false fetchColumns: 
> [] iterators: [] logLevel: INFO splitsize: 32 maxsize: 10
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ACCUMULO-4592) Add since information to properties

2018-03-07 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner updated ACCUMULO-4592:
---
Fix Version/s: (was: 1.7.4)
   (was: 1.9.0)
   (was: 2.0.0)

> Add since information to properties
> ---
>
> Key: ACCUMULO-4592
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4592
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Priority: Major
>
> It would be very useful to document, in the user manual, the version in which 
> each property was added.  Not sure what the best way to do this is.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ACCUMULO-4811) Session manager does not always act on cleanUp() return

2018-03-07 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner updated ACCUMULO-4811:
---
Fix Version/s: (was: 1.7.4)
   (was: 1.9.0)
   (was: 2.0.0)

> Session manager does not always act on cleanUp() return
> ---
>
> Key: ACCUMULO-4811
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4811
> Project: Accumulo
>  Issue Type: Bug
>Affects Versions: 1.7.3, 1.8.1
>Reporter: Keith Turner
>Priority: Major
>
> While working on ACCUMULO-4782 I noticed that the session manager does not 
> always look at the return value of session.cleanUp().  It seems like it 
> should always do something when false is returned. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (ACCUMULO-4836) Tables do not always wait for online or offline

2018-03-07 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner reassigned ACCUMULO-4836:
--

Assignee: Keith Turner

> Tables do not always wait for online or offline
> ---
>
> Key: ACCUMULO-4836
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4836
> Project: Accumulo
>  Issue Type: Bug
>Affects Versions: 1.7.3
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.7.4, 1.9.0, 2.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> While investigating why TabletStateChangeIteratorIT was failing, it was 
> discovered that bringing a table online with wait=true does not always wait.  
> The test relied on this API to wait, which is why it was failing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ACCUMULO-4838) Create SPI package

2018-03-01 Thread Keith Turner (JIRA)
Keith Turner created ACCUMULO-4838:
--

 Summary: Create SPI package
 Key: ACCUMULO-4838
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4838
 Project: Accumulo
  Issue Type: Improvement
Reporter: Keith Turner


Accumulo has multiple pluggable services.  It would be nice if the SPIs 
(service provider interfaces) for these were in one package.  For existing 
service SPIs, this cannot be easily done.  However, for new service SPIs it 
would be nice to start putting them under a single package.  This package could 
be {{org.apache.accumulo.core.spi}}.  Currently there are at least two new 
unreleased SPIs, for caching and summarization.  These could be moved to the new 
package.

For existing SPIs we could possibly do the following:
 * Create a new SPI in the new package
 * Make existing SPI extend new SPI and deprecate it

The contents of this package could be analyzed by APILyzer to ensure only API 
and SPI types are used.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ACCUMULO-4837) Allow short service names in addition to class names.

2018-03-01 Thread Keith Turner (JIRA)
Keith Turner created ACCUMULO-4837:
--

 Summary: Allow short service names in addition to class names.
 Key: ACCUMULO-4837
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4837
 Project: Accumulo
  Issue Type: Improvement
Reporter: Keith Turner


In 2.0.0-SNAPSHOT the cache implementation was made configurable.  Currently to 
configure it, you set a property like the following.

{noformat}
 
tserver.cache.manager.class=org.apache.accumulo.core.file.blockfile.cache.tinylfu.TinyLfuBlockCacheManager
{noformat}

I would much rather be able to provide a short service name like the following 
when configuring the cache.  However, I do not want the list of names to be 
predefined; users should be able to provide their own implementations.

{noformat}
  tserver.cache.implementation=tinylfu
{noformat}

What is a good way to do this? Is there a good reason not to do this and just 
stick with class names only?  I was also thinking it may be nice to have a 
shell command for listing services, but that could be done independently.

One way I thought of doing this is having an interface like the following 
that services (balancer, compaction strategy, cache, etc.) could implement.

{code:java}
public interface AccumuloService {
  /**
   * A human readable, short, unique identification that can be specified in 
   * configuration to identify a service implementation.
   */
  public String getName();

  public static <C extends AccumuloService> C load(String configId,
      Class<C> serviceType, ClassLoader classLoader) {
    ServiceLoader<C> services = ServiceLoader.load(serviceType, classLoader);

    for (C service : services) {
      if (service.getName().equals(configId)
          || service.getClass().getName().equals(configId)) {
        return service;
      }
    }

    return null;
  }
}
{code}

Then the cache implementation could provide a name

{code:java}
//assume BlockCacheManager implements AccumuloService
public class TinyLfuBlockCacheManager extends BlockCacheManager {

  private static final Logger LOG = 
LoggerFactory.getLogger(TinyLfuBlockCacheManager.class);

  @Override
  protected TinyLfuBlockCache createCache(Configuration conf, CacheType type) {
LOG.info("Creating {} cache with configuration {}", type, conf);
return new TinyLfuBlockCache(conf, type);
  }

  @Override
  public String getName() {
return "tinylfu";
  }
}
{code}

The code to load a block cache implementation would look like the following:

{code:java}
String impl = conf.get(Property.TSERV_CACHE_MANAGER_IMPL);
BlockCacheManager bcm = AccumuloService.load(impl, BlockCacheManager.class, 
AccumuloVFSClassLoader.getClassLoader());
{code}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ACCUMULO-4836) Tables do not always wait for online or offline

2018-03-01 Thread Keith Turner (JIRA)
Keith Turner created ACCUMULO-4836:
--

 Summary: Tables do not always wait for online or offline
 Key: ACCUMULO-4836
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4836
 Project: Accumulo
  Issue Type: Bug
Affects Versions: 1.7.3
Reporter: Keith Turner
 Fix For: 1.7.4, 1.9.0, 2.0.0


While investigating why TabletStateChangeIteratorIT was failing, it was 
discovered that bringing a table online with wait=true does not always wait.  
The test relied on this API to wait, which is why it was failing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-4809) Session manager clean up can happen when lock held.

2018-02-28 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4809.

Resolution: Fixed

> Session manager clean up can happen when lock held.
> ---
>
> Key: ACCUMULO-4809
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4809
> Project: Accumulo
>  Issue Type: Bug
>Affects Versions: 1.7.3, 1.8.1
>Reporter: Keith Turner
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.7.4, 1.9.0, 2.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> While working on [PR #382|https://github.com/apache/accumulo/pull/382] for 
> ACCUMULO-4782 I noticed a significant concurrency bug.  Before #382 there was 
> a single lock for the session manager. The session manager will clean up idle 
> sessions.  This cleanup should happen outside the session manager lock, 
> because all tserver read/write operations use the session manager, so it 
> needs to be responsive.
> The bug is the following.
>  * Both getActiveScansPerTable() and getActiveScans() lock the session 
> manager and then lock idleSessions.  See [SessionManager line 
> 233|https://github.com/apache/accumulo/blob/rel/1.7.3/server/tserver/src/main/java/org/apache/accumulo/tserver/session/SessionManager.java#L233]
>  
>  * The sweep() method locks idleSessions and does cleanup while this lock is 
> held. [See SessionManager 
> 200|https://github.com/apache/accumulo/blob/rel/1.7.3/server/tserver/src/main/java/org/apache/accumulo/tserver/session/SessionManager.java#L200]
>  
> Therefore it is possible for getActiveScansPerTable() or getActiveScans() to 
> lock the session manager and then block trying to lock idleSessions while 
> cleanup is happening in sweep().  This will block all access to the session 
> manager while cleanup happens.
> The changes in #382 will fix this for 1.9.0 and 2.0.0.  However, I am not 
> sure about backporting #382 to 1.7.  A more targeted fix could be made for 
> 1.7, or #382 could be backported.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (ACCUMULO-4809) Session manager clean up can happen when lock held.

2018-02-28 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner reassigned ACCUMULO-4809:
--

Assignee: Keith Turner

> Session manager clean up can happen when lock held.
> ---
>
> Key: ACCUMULO-4809
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4809
> Project: Accumulo
>  Issue Type: Bug
>Affects Versions: 1.7.3, 1.8.1
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 1.7.4, 1.9.0, 2.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> While working on [PR #382|https://github.com/apache/accumulo/pull/382] for 
> ACCUMULO-4782 I noticed a significant concurrency bug.  Before #382 there was 
> a single lock for the session manager. The session manager will clean up idle 
> sessions.  This cleanup should happen outside the session manager lock, 
> because all tserver read/write operations use the session manager, so it 
> needs to be responsive.
> The bug is the following.
>  * Both getActiveScansPerTable() and getActiveScans() lock the session 
> manager and then lock idleSessions.  See [SessionManager line 
> 233|https://github.com/apache/accumulo/blob/rel/1.7.3/server/tserver/src/main/java/org/apache/accumulo/tserver/session/SessionManager.java#L233]
>  
>  * The sweep() method locks idleSessions and does cleanup while this lock is 
> held. [See SessionManager 
> 200|https://github.com/apache/accumulo/blob/rel/1.7.3/server/tserver/src/main/java/org/apache/accumulo/tserver/session/SessionManager.java#L200]
>  
> Therefore it is possible for getActiveScansPerTable() or getActiveScans() to 
> lock the session manager and then block trying to lock idleSessions while 
> cleanup is happening in sweep().  This will block all access to the session 
> manager while cleanup happens.
> The changes in #382 will fix this for 1.9.0 and 2.0.0.  However, I am not 
> sure about backporting #382 to 1.7.  A more targeted fix could be made for 
> 1.7, or #382 could be backported.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ACCUMULO-4832) Seeing warnings when write ahead log changes.

2018-02-28 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16380533#comment-16380533
 ] 

Keith Turner commented on ACCUMULO-4832:


I ran continuous ingest for 1 hour on 8 nodes with this change (commit 
3df4acc7f3662d16eb752fdf46cdfd6285f54512) and saw no warnings.  I took the 
instance I had set up for 1.7.4-RC0 testing and replaced all of the jars.

> Seeing warnings when write ahead log changes.
> -
>
> Key: ACCUMULO-4832
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4832
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Keith Turner
>Assignee: Ivan Bella
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 1.7.4, 1.9.0, 2.0.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> While running continuous ingest against 1.7.4-rc0 I saw a lot of warnings 
> like the following.
> {noformat}
> 2018-02-26 17:51:58,189 [log.TabletServerLogger] WARN : Logs closed while 
> writing, retrying attempt 1 (suppressing retry messages for 18ms)
> 2018-02-26 17:51:58,724 [log.TabletServerLogger] WARN : Logs closed while 
> writing, retrying attempt 1 (suppressing retry messages for 18ms)
> 2018-02-26 17:51:58,940 [log.TabletServerLogger] WARN : Logs closed while 
> writing, retrying attempt 1 (suppressing retry messages for 18ms)
> 2018-02-26 17:51:59,226 [log.TabletServerLogger] WARN : Logs closed while 
> writing, retrying attempt 1 (suppressing retry messages for 18ms)
> 2018-02-26 17:51:59,227 [log.TabletServerLogger] WARN : Logs closed while 
> writing, retrying attempt 1 (suppressing retry messages for 18ms)
> 2018-02-26 17:51:59,227 [log.TabletServerLogger] WARN : Logs closed while 
> writing, retrying attempt 1 (suppressing retry messages for 18ms)
> {noformat}
>  
> The warnings are generated by [TabletServerLogger.java line 
> 341|https://github.com/apache/accumulo/blob/4e91215f101362ef206e9f213b4d8d12b3f6e0e2/server/tserver/src/main/java/org/apache/accumulo/tserver/log/TabletServerLogger.java#L341]
>  when a write ahead log is closed.  Write ahead logs are closed as part of 
> normal operations as seen on [TabletServerLogger.java line 
> 386|https://github.com/apache/accumulo/blob/4e91215f101362ef206e9f213b4d8d12b3f6e0e2/server/tserver/src/main/java/org/apache/accumulo/tserver/log/TabletServerLogger.java#L386].
>   There should not be a warning when this happens.  This is caused by changes 
> made for ACCUMULO-4777.  Before these changes this event was logged at debug. 
>  At this time, these changes have not been released. It would be nice to fix 
> this before releasing 1.7.4.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ACCUMULO-4832) Seeing warnings when write ahead log changes.

2018-02-26 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner updated ACCUMULO-4832:
---
Priority: Blocker  (was: Major)

> Seeing warnings when write ahead log changes.
> -
>
> Key: ACCUMULO-4832
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4832
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Keith Turner
>Priority: Blocker
> Fix For: 1.7.4, 1.9.0, 2.0.0
>
>
> While running continuous ingest against 1.7.4-rc0 I saw a lot of warnings 
> like the following.
> {noformat}
> 2018-02-26 17:51:58,189 [log.TabletServerLogger] WARN : Logs closed while 
> writing, retrying attempt 1 (suppressing retry messages for 18ms)
> 2018-02-26 17:51:58,724 [log.TabletServerLogger] WARN : Logs closed while 
> writing, retrying attempt 1 (suppressing retry messages for 18ms)
> 2018-02-26 17:51:58,940 [log.TabletServerLogger] WARN : Logs closed while 
> writing, retrying attempt 1 (suppressing retry messages for 18ms)
> 2018-02-26 17:51:59,226 [log.TabletServerLogger] WARN : Logs closed while 
> writing, retrying attempt 1 (suppressing retry messages for 18ms)
> 2018-02-26 17:51:59,227 [log.TabletServerLogger] WARN : Logs closed while 
> writing, retrying attempt 1 (suppressing retry messages for 18ms)
> 2018-02-26 17:51:59,227 [log.TabletServerLogger] WARN : Logs closed while 
> writing, retrying attempt 1 (suppressing retry messages for 18ms)
> {noformat}
>  
> The warnings are generated by [TabletServerLogger.java line 
> 341|https://github.com/apache/accumulo/blob/4e91215f101362ef206e9f213b4d8d12b3f6e0e2/server/tserver/src/main/java/org/apache/accumulo/tserver/log/TabletServerLogger.java#L341]
>  when a write ahead log is closed.  Write ahead logs are closed as part of 
> normal operations as seen on [TabletServerLogger.java line 
> 386|https://github.com/apache/accumulo/blob/4e91215f101362ef206e9f213b4d8d12b3f6e0e2/server/tserver/src/main/java/org/apache/accumulo/tserver/log/TabletServerLogger.java#L386].
>   There should not be a warning when this happens.  This is caused by changes 
> made for ACCUMULO-4777.  Before these changes this event was logged at debug. 
>  At this time, these changes have not been released. It would be nice to fix 
> this before releasing 1.7.4.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ACCUMULO-4832) Seeing warnings when write ahead log changes.

2018-02-26 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner updated ACCUMULO-4832:
---
Fix Version/s: 2.0.0
   1.9.0
   1.7.4

> Seeing warnings when write ahead log changes.
> -
>
> Key: ACCUMULO-4832
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4832
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Keith Turner
>Priority: Major
> Fix For: 1.7.4, 1.9.0, 2.0.0
>
>
> While running continuous ingest against 1.7.4-rc0 I saw a lot of warnings 
> like the following.
> {noformat}
> 2018-02-26 17:51:58,189 [log.TabletServerLogger] WARN : Logs closed while 
> writing, retrying attempt 1 (suppressing retry messages for 18ms)
> 2018-02-26 17:51:58,724 [log.TabletServerLogger] WARN : Logs closed while 
> writing, retrying attempt 1 (suppressing retry messages for 18ms)
> 2018-02-26 17:51:58,940 [log.TabletServerLogger] WARN : Logs closed while 
> writing, retrying attempt 1 (suppressing retry messages for 18ms)
> 2018-02-26 17:51:59,226 [log.TabletServerLogger] WARN : Logs closed while 
> writing, retrying attempt 1 (suppressing retry messages for 18ms)
> 2018-02-26 17:51:59,227 [log.TabletServerLogger] WARN : Logs closed while 
> writing, retrying attempt 1 (suppressing retry messages for 18ms)
> 2018-02-26 17:51:59,227 [log.TabletServerLogger] WARN : Logs closed while 
> writing, retrying attempt 1 (suppressing retry messages for 18ms)
> {noformat}
>  
> The warnings are generated by [TabletServerLogger.java line 
> 341|https://github.com/apache/accumulo/blob/4e91215f101362ef206e9f213b4d8d12b3f6e0e2/server/tserver/src/main/java/org/apache/accumulo/tserver/log/TabletServerLogger.java#L341]
>  when a write ahead log is closed.  Write ahead logs are closed as part of 
> normal operations as seen on [TabletServerLogger.java line 
> 386|https://github.com/apache/accumulo/blob/4e91215f101362ef206e9f213b4d8d12b3f6e0e2/server/tserver/src/main/java/org/apache/accumulo/tserver/log/TabletServerLogger.java#L386].
>   There should not be a warning when this happens.  This is caused by changes 
> made for ACCUMULO-4777.  Before these changes this event was logged at debug. 
>  At this time, these changes have not been released. It would be nice to fix 
> this before releasing 1.7.4.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ACCUMULO-4832) Seeing warnings when write ahead log changes.

2018-02-26 Thread Keith Turner (JIRA)
Keith Turner created ACCUMULO-4832:
--

 Summary: Seeing warnings when write ahead log changes.
 Key: ACCUMULO-4832
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4832
 Project: Accumulo
  Issue Type: Bug
Reporter: Keith Turner


While running continuous ingest against 1.7.4-rc0 I saw a lot of warnings like 
the following.

{noformat}
2018-02-26 17:51:58,189 [log.TabletServerLogger] WARN : Logs closed while 
writing, retrying attempt 1 (suppressing retry messages for 18ms)
2018-02-26 17:51:58,724 [log.TabletServerLogger] WARN : Logs closed while 
writing, retrying attempt 1 (suppressing retry messages for 18ms)
2018-02-26 17:51:58,940 [log.TabletServerLogger] WARN : Logs closed while 
writing, retrying attempt 1 (suppressing retry messages for 18ms)
2018-02-26 17:51:59,226 [log.TabletServerLogger] WARN : Logs closed while 
writing, retrying attempt 1 (suppressing retry messages for 18ms)
2018-02-26 17:51:59,227 [log.TabletServerLogger] WARN : Logs closed while 
writing, retrying attempt 1 (suppressing retry messages for 18ms)
2018-02-26 17:51:59,227 [log.TabletServerLogger] WARN : Logs closed while 
writing, retrying attempt 1 (suppressing retry messages for 18ms)
{noformat}
 
The warnings are generated by [TabletServerLogger.java line 
341|https://github.com/apache/accumulo/blob/4e91215f101362ef206e9f213b4d8d12b3f6e0e2/server/tserver/src/main/java/org/apache/accumulo/tserver/log/TabletServerLogger.java#L341]
 when a write ahead log is closed.  Write ahead logs are closed as part of 
normal operations as seen on [TabletServerLogger.java line 
386|https://github.com/apache/accumulo/blob/4e91215f101362ef206e9f213b4d8d12b3f6e0e2/server/tserver/src/main/java/org/apache/accumulo/tserver/log/TabletServerLogger.java#L386].
  There should not be a warning when this happens.  This is caused by changes 
made for ACCUMULO-4777.  Before these changes this event was logged at debug.  
At this time, these changes have not been released. It would be nice to fix 
this before releasing 1.7.4.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-4805) Seeing thread contention on FileManager

2018-02-22 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4805.

Resolution: Fixed

I did not remove the lock contention in the two PRs, I only made it less 
painful.  The global tserver lock is still there.  I did the following:
 * made the code that executes while the lock is held faster
 * made a scan operation acquire the lock much fewer times
 * changed the semaphore from fair to non-fair, as the fair semaphore was very slow

I think it may be best to remove the file manager completely as described in 
ACCUMULO-543.  The only reservation I have is continually deserializing the 
rfile metadata.
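For reference, the fairness trade-off mentioned above comes from java.util.concurrent.Semaphore: a fair semaphore grants permits in FIFO arrival order, which requires queue management on every acquire and is measurably slower under contention, while a non-fair semaphore allows barging. A minimal illustration (the permit count here is arbitrary):

```java
import java.util.concurrent.Semaphore;

public class SemaphoreFairness {
  public static void main(String[] args) throws InterruptedException {
    // Fair: permits granted in FIFO order, slower under contention.
    Semaphore fair = new Semaphore(10, true);
    // Non-fair: threads may barge ahead of waiters, higher throughput.
    Semaphore nonFair = new Semaphore(10, false);

    fair.acquire();
    nonFair.acquire();
    System.out.println(fair.isFair());     // true
    System.out.println(nonFair.isFair());  // false
    fair.release();
    nonFair.release();
  }
}
```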

> Seeing thread contention on FileManager
> ---
>
> Key: ACCUMULO-4805
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4805
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0, 2.0.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Accumulo has a tablet server wide cache of open files.  Accessing this cache 
> obtains a global lock.  In profiling, I am seeing contention on this lock.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ACCUMULO-4823) Run autorefactor

2018-02-22 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16373576#comment-16373576
 ] 

Keith Turner commented on ACCUMULO-4823:


When experimenting with this I actually found one instance where the 
transformation it did was wrong for code with a lambda.

> Run autorefactor
> 
>
> Key: ACCUMULO-4823
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4823
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Priority: Major
> Fix For: 2.0.0
>
>
> Could look into running this tool on Accumulo source code.
>  
> http://autorefactor.org/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ACCUMULO-4823) Run autorefactor

2018-02-22 Thread Keith Turner (JIRA)
Keith Turner created ACCUMULO-4823:
--

 Summary: Run autorefactor
 Key: ACCUMULO-4823
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4823
 Project: Accumulo
  Issue Type: Improvement
Reporter: Keith Turner
 Fix For: 2.0.0


Could look into running this tool on Accumulo source code.

 

http://autorefactor.org/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ACCUMULO-4822) Remove observers from configuration

2018-02-21 Thread Keith Turner (JIRA)
Keith Turner created ACCUMULO-4822:
--

 Summary: Remove observers from configuration
 Key: ACCUMULO-4822
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4822
 Project: Accumulo
  Issue Type: Improvement
Reporter: Keith Turner


As part of the work done for ACCUMULO-4779 the method getUpdateCount was added 
to AccumuloConfiguration.  Subclasses of AccumuloConfiguration increment a 
counter each time the configuration changes, and this counter is available via 
getUpdateCount.  Anything derived from configuration can be cached, and this 
counter can be used to know when it needs to be derived again.

Some AccumuloConfigurations are also observable.  I think this observer pattern 
could be dropped in favor of using this counter.  I think this change would 
simplify the code.
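A minimal sketch of the counter-based caching described above; getUpdateCount mirrors the method named in the issue, while the Config class and the derive logic are hypothetical stand-ins, not Accumulo code:

```java
import java.util.concurrent.atomic.AtomicLong;

public class UpdateCountCache {
  // Hypothetical stand-in for AccumuloConfiguration: only getUpdateCount()
  // comes from the issue; the rest is illustrative.
  static class Config {
    private final AtomicLong updateCount = new AtomicLong();
    private volatile String value = "a";

    long getUpdateCount() { return updateCount.get(); }
    String get() { return value; }
    void set(String v) { value = v; updateCount.incrementAndGet(); }
  }

  private long cachedAt = -1; // update count at which `derived` was computed
  private String derived;

  // Re-derive only when the configuration's counter has moved, instead of
  // registering an observer that must be notified on every change.
  String derive(Config conf) {
    long count = conf.getUpdateCount();
    if (count != cachedAt) {
      derived = conf.get().toUpperCase(); // the expensive derivation
      cachedAt = count;
    }
    return derived;
  }

  public static void main(String[] args) {
    Config conf = new Config();
    UpdateCountCache cache = new UpdateCountCache();
    System.out.println(cache.derive(conf)); // derives "A"
    System.out.println(cache.derive(conf)); // cached, no re-derivation
    conf.set("b");
    System.out.println(cache.derive(conf)); // counter moved, derives "B"
  }
}
```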



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-4788) Improve Thrift Transport pool

2018-02-20 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4788.

Resolution: Fixed

> Improve Thrift Transport pool
> -
>
> Key: ACCUMULO-4788
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4788
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0, 2.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Accumulo has a pool of recently opened connections to tablet servers.  When 
> connecting to tablet servers, this pool is checked first. The pool is built 
> around a map of lists.  There are two problems with this pool:
>  * It has a global lock around the map of lists
>  * When trying to find a connection it does a linear search for a 
> non-reserved connection (this is per tablet server)
> Could possibly move to a model of having a list of unreserved connections and 
> a set of reserved connections per tablet server. Then to get a connection, 
> could remove from the unreserved list and add to the reserved set.  This 
> would be a constant time operation.
> For the locking, could move to a model of using a concurrent map and locking 
> per tserver instead of locking the entire map.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ACCUMULO-4821) Make thrift transport pool more concurrent

2018-02-20 Thread Keith Turner (JIRA)
Keith Turner created ACCUMULO-4821:
--

 Summary: Make thrift transport pool more concurrent
 Key: ACCUMULO-4821
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4821
 Project: Accumulo
  Issue Type: Improvement
Reporter: Keith Turner


The thrift transport pool has a global lock; it would be nice to remove this 
and replace it with something more concurrent.  See ACCUMULO-4788.
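One possible shape for a more concurrent pool, following the per-tserver idea in ACCUMULO-4788, is a ConcurrentHashMap of per-server queues, so that checking a connection in or out only touches one server's queue and never takes a global lock. All names here are illustrative; a real pooled transport would carry a socket rather than the String used below:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedDeque;

public class TransportPoolSketch {
  // One queue of unreserved connections per tablet server address.
  private final ConcurrentHashMap<String, ConcurrentLinkedDeque<String>> unreserved =
      new ConcurrentHashMap<>();

  // Reserving touches only this tserver's queue: constant time, no global lock.
  String reserve(String tserver) {
    return unreserved
        .computeIfAbsent(tserver, k -> new ConcurrentLinkedDeque<>())
        .pollFirst(); // null means the caller must open a new connection
  }

  // Returning a connection makes it available to the next caller.
  void release(String tserver, String transport) {
    unreserved.computeIfAbsent(tserver, k -> new ConcurrentLinkedDeque<>())
        .addFirst(transport);
  }

  public static void main(String[] args) {
    TransportPoolSketch pool = new TransportPoolSketch();
    System.out.println(pool.reserve("ts1")); // null: pool is empty
    pool.release("ts1", "conn-1");
    System.out.println(pool.reserve("ts1")); // conn-1
  }
}
```

A real implementation would also need a reserved set per server and an idle-sweep, but those follow the same per-server partitioning.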



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-4799) In tablet server start scan authenticates twice

2018-02-20 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4799.

Resolution: Fixed
  Assignee: Keith Turner

> In tablet server start scan authenticates twice
> ---
>
> Key: ACCUMULO-4799
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4799
> Project: Accumulo
>  Issue Type: Improvement
>Affects Versions: 1.7.3, 1.8.1
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0, 2.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The code that handles a start scan RPC call checks authentication twice.  
> Each call to authenticate takes a bit of time.  It would be nice if it only 
> did it once.
> At [TabletServer line 
> 479|https://github.com/apache/accumulo/blob/rel/1.8.1/server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java#L479]
>  a call to canScan is made which calls authenticate.  Then at [TabletServer 
> line 
> 482|https://github.com/apache/accumulo/blob/rel/1.8.1/server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java#L482]
>  a call to check authorizations is made which also authenticates.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (ACCUMULO-4800) Preparse iterator configuration

2018-02-20 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4800.

Resolution: Fixed

> Preparse iterator configuration
> ---
>
> Key: ACCUMULO-4800
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4800
> Project: Accumulo
>  Issue Type: Improvement
>Affects Versions: 1.7.3, 1.8.1
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0, 2.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> I am noticing that for small scans a good bit of time is spent parsing 
> iterator config.  It would be nice to pre-parse iterator config and only 
> reparse when table config changes.





[jira] [Commented] (ACCUMULO-4813) Accepting mapping file for bulk import

2018-02-16 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366690#comment-16366690
 ] 

Keith Turner commented on ACCUMULO-4813:


[~m-hogue] it would be additive. I think it would be nice to deprecate the 
current bulk import process in favor of this.  This could be done by adding 
new APIs to do bulk import with a mapping file and deprecating the current APIs.

> Accepting mapping file for bulk import
> --
>
> Key: ACCUMULO-4813
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4813
> Project: Accumulo
>  Issue Type: Sub-task
>Reporter: Keith Turner
>Priority: Major
> Fix For: 2.0.0
>
>
> During bulk import, inspecting files to determine where they go is expensive 
> and slow.  In order to spread the cost, Accumulo has an internal mechanism to 
> spread the work of inspecting files to random tablet servers.  Because this 
> internal process takes time and consumes resources on the cluster, users want 
> control over it.  The best way to give this control may be to externalize it 
> by allowing bulk imports to have a mapping file.  This mapping file would 
> specify the ranges where files should be loaded.  If Accumulo provided API to 
> help produce this file, then that work could be done in Map Reduce or Spark.  
> This would give users all the control they want over when and where this 
> computation is done.  This would naturally fit in the process used to create 
> the bulk files. 
> To make bulk import fast this mapping file should have the following 
> properties.
>  * Key in file is a range
>  * Value in file is a list of files
>  * Ranges are non overlapping
>  * File is sorted by range/key
>  * Has a mapping for every non-empty file in the bulk import directory.
> If Accumulo provides APIs to do the following operations, then producing the 
> file could be written as a map/reduce job.
>  * For a given rfile produce a list of row ranges where the file should be 
> loaded.  These row ranges would be based on tablets.
>  * Merge row range,list of file pairs
>  * Serialize row range,list of files pairs
> With a mapping file, the bulk import algorithm could be written as follows.  
> This could all be executed in the master with no need to run inspection task 
> on random tablet servers.
>  * Sanity check file
>  ** Ensure in sorted order
>  ** Ensure ranges are non-overlapping
>  ** Ensure each file in directory has at least one entry in file
>  ** Ensure all splits in the file exist in the table.
>  * Since file is sorted can do a merged read of file and metadata table, 
> looping over the following operations for each tablet until all files are 
> loaded.
>  ** Read the loaded files for the tablet
>  ** Read the files to load for the range
>  ** For any files not loaded, send an async load message to the tablet server
> The above algorithm can just keep scanning the metadata table and sending 
> async load messages until the bulk import is complete.  Since the load 
> messages are async, the bulk load of a large number of files could 
> potentially be very fast.
> The bulk load operation can easily handle the case of tablets splitting 
> during the operation by matching a single range in the file to multiple 
> tablets.  However attempting to handle merges would be a lot more tricky.  It 
> would probably be simplest to fail the operation if a merge is detected.  The 
> nice thing is that this can be done in a very clean way.   Once the bulk 
> import operation has the table lock, merges cannot happen.  So after getting 
> the table lock the bulk import operation can ensure all splits in the file 
> exist in the table. The operation can abort if the condition is not met 
> before doing any work.  If this condition is not met, it indicates a merge 
> happened between generating the mapping file and doing the bulk import.
> Hopefully the mapping file plus the algorithm that sends async load messages 
> can dramatically speed up bulk import operations.  This may lessen the need 
> for other things like prioritizing bulk import.  To measure this, it would be 
> very useful to create a bulk import performance test that can create many files 
> with very little data and measure the time it takes to load them.
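
As a rough illustration of the sorted/non-overlapping sanity checks described above,
here is a sketch with a simplified range type. `RowRange` and `MappingFileChecks`
are hypothetical stand-ins, not Accumulo's actual row-range types:

```java
import java.util.List;

class MappingFileChecks {
  // Simplified stand-in for a row range; endRow is treated as exclusive.
  record RowRange(String startRow, String endRow) {}

  // Verify mapping entries are sorted by range and non-overlapping, as the
  // sanity-check step described above requires.
  static boolean sortedAndNonOverlapping(List<RowRange> ranges) {
    for (int i = 1; i < ranges.size(); i++) {
      // Each range must start at or after the point where the previous one ends.
      if (ranges.get(i - 1).endRow().compareTo(ranges.get(i).startRow()) > 0) {
        return false;
      }
    }
    return true;
  }
}
```

Because the file is sorted, this check (like the merged read against the metadata
table) is a single linear pass.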





[jira] [Resolved] (ACCUMULO-4641) Modify BlockCache interface to avoid race conditions

2018-02-15 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4641.

Resolution: Fixed

> Modify BlockCache interface to avoid race conditions
> 
>
> Key: ACCUMULO-4641
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4641
> Project: Accumulo
>  Issue Type: Sub-task
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Currently the BlockCache interface has functions to get and put.  Accumulo 
> will try to get a block, if it does not exist load it, and then put it in the 
> cache.  This can lead to race conditions where multiple threads unnecessarily 
> load the same block.
> I think it would be better to modify the block cache interface to only have a 
> function like the following.  
> {code:java}
>   CacheEntry get(String blockName, BlockLoader loader)
> {code} 
> BlockLoader represents a function that the cache can call if a block is not 
> present.  The cache implementation can attempt to handle load race conditions 
> however it likes.
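
The proposed get-with-loader shape maps naturally onto
`ConcurrentHashMap.computeIfAbsent`. Below is a minimal sketch, assuming
hypothetical `SimpleBlockCache` and `BlockLoader` names (Accumulo's actual
`CacheEntry`/`BlockLoader` types differ):

```java
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of a cache whose only lookup method takes a loader.
class SimpleBlockCache {
  interface BlockLoader {
    byte[] load(String blockName);
  }

  private final ConcurrentHashMap<String, byte[]> blocks = new ConcurrentHashMap<>();

  // computeIfAbsent runs the loader at most once per missing key, so
  // concurrent callers cannot redundantly load the same block.
  byte[] get(String blockName, BlockLoader loader) {
    return blocks.computeIfAbsent(blockName, loader::load);
  }
}
```

The race-handling strategy lives entirely inside the cache implementation, which
is exactly the flexibility the single-method interface buys.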





[jira] [Commented] (ACCUMULO-4788) Improve Thrift Transport pool

2018-02-15 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16365926#comment-16365926
 ] 

Keith Turner commented on ACCUMULO-4788:


I need to open a follow up issue before closing this out.

> Improve Thrift Transport pool
> -
>
> Key: ACCUMULO-4788
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4788
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0, 2.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Accumulo has a pool of recently opened connections to tablet servers.  When 
> connecting to tablet servers, this pool is checked first. The pool is built 
> around a map of lists.  There are two problems with this pool:
>  * It has a global lock around the map of lists
>  * When trying to find a connection, it does a linear search for a
> non-reserved connection (this is per tablet server)
> Could possibly move to a model of having a list of unreserved connections and 
> a set of reserved connections per tablet server. Then to get a connection, 
> could remove from the unreserved list and add to the reserved set.  This 
> would be a constant time operation.
> For the locking, could move to a model of using a concurrent map and locking 
> per tserver instead of locking the entire map.
>  
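
The per-tserver model sketched in the description could look roughly like the
following. All names here are hypothetical illustrations, not Accumulo's code,
and the connection type is represented generically:

```java
import java.util.Deque;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedDeque;

// Sketch: one unreserved deque plus one reserved set per tablet server,
// held in a concurrent map, so there is no global lock and no linear search.
class TransportPoolSketch<T> {
  private static final class PerServer<T> {
    final Deque<T> unreserved = new ConcurrentLinkedDeque<>();
    final Set<T> reserved = ConcurrentHashMap.newKeySet();
  }

  private final ConcurrentHashMap<String, PerServer<T>> pool = new ConcurrentHashMap<>();

  // Constant-time reserve: pop one unreserved connection and mark it reserved.
  T reserve(String tserver) {
    PerServer<T> ps = pool.computeIfAbsent(tserver, k -> new PerServer<>());
    T conn = ps.unreserved.pollFirst();
    if (conn != null) {
      ps.reserved.add(conn);
    }
    return conn; // null means the caller must open a new connection
  }

  // Return a connection to the unreserved pool for this tablet server.
  void release(String tserver, T conn) {
    PerServer<T> ps = pool.computeIfAbsent(tserver, k -> new PerServer<>());
    ps.reserved.remove(conn);
    ps.unreserved.addFirst(conn);
  }
}
```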





[jira] [Commented] (ACCUMULO-4805) Seeing thread contention on FileManager

2018-02-15 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16365918#comment-16365918
 ] 

Keith Turner commented on ACCUMULO-4805:


I still want to create a follow-on issue about lock contention and make a few 
changes to do less work while the lock is held.

> Seeing thread contention on FileManager
> ---
>
> Key: ACCUMULO-4805
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4805
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0, 2.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Accumulo has a tablet server wide cache of open files.  Accessing this cache 
> obtains a global lock.  In profiling, I am seeing contention on this lock.





[jira] [Comment Edited] (ACCUMULO-4805) Seeing thread contention on FileManager

2018-02-15 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357765#comment-16357765
 ] 

Keith Turner edited comment on ACCUMULO-4805 at 2/15/18 4:55 PM:
-

The changes I made in #380 do not completely remove the contention, they just 
lessened the amount of time spent doing work while the lock is still held.


was (Author: kturner):
The changes I made in #380 do not completely remove the contention, they just 
lessened the amount of content. 

> Seeing thread contention on FileManager
> ---
>
> Key: ACCUMULO-4805
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4805
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0, 2.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Accumulo has a tablet server wide cache of open files.  Accessing this cache 
> obtains a global lock.  In profiling, I am seeing contention on this lock.





[jira] [Updated] (ACCUMULO-4788) Improve Thrift Transport pool

2018-02-15 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner updated ACCUMULO-4788:
---
Fix Version/s: 1.9.0

> Improve Thrift Transport pool
> -
>
> Key: ACCUMULO-4788
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4788
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0, 2.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Accumulo has a pool of recently opened connections to tablet servers.  When 
> connecting to tablet servers, this pool is checked first. The pool is built 
> around a map of lists.  There are two problems with this pool:
>  * It has a global lock around the map of lists
>  * When trying to find a connection, it does a linear search for a
> non-reserved connection (this is per tablet server)
> Could possibly move to a model of having a list of unreserved connections and 
> a set of reserved connections per tablet server. Then to get a connection, 
> could remove from the unreserved list and add to the reserved set.  This 
> would be a constant time operation.
> For the locking, could move to a model of using a concurrent map and locking 
> per tserver instead of locking the entire map.
>  





[jira] [Resolved] (ACCUMULO-4801) Consider precomputing some client context fields

2018-02-15 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4801.

Resolution: Fixed

> Consider precomputing some client context fields
> 
>
> Key: ACCUMULO-4801
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4801
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0, 2.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently each time a connection is requested from the thrift transport 
> pool, three methods are called on client context to get ssl, sasl, and 
> timeout.  These in turn call methods on configuration.  This is showing up in 
> profiling as slow.  I wonder if these could be precomputed in the client 
> context constructor.
>  
> Also, repeatedly calling rpcCreds() on client context is showing up as slow.
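
As a sketch of the precomputation idea, the derived settings can be parsed once in
the constructor and thereafter read as plain fields. The class, field, and property
names below are illustrative, not Accumulo's actual configuration keys:

```java
import java.util.Map;

// Parse derived settings once at construction instead of rereading
// configuration on every connection request.
class ClientContextSketch {
  private final boolean sslEnabled;
  private final long rpcTimeoutMillis;

  ClientContextSketch(Map<String, String> conf) {
    // Previously this parsing happened on each transport-pool request.
    this.sslEnabled = Boolean.parseBoolean(conf.getOrDefault("ssl.enabled", "false"));
    this.rpcTimeoutMillis = Long.parseLong(conf.getOrDefault("rpc.timeout.ms", "120000"));
  }

  boolean sslEnabled() { return sslEnabled; }          // plain field read
  long rpcTimeoutMillis() { return rpcTimeoutMillis; } // no per-call parsing
}
```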





[jira] [Updated] (ACCUMULO-4801) Consider precomputing some client context fields

2018-02-15 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner updated ACCUMULO-4801:
---
Fix Version/s: 1.9.0

> Consider precomputing some client context fields
> 
>
> Key: ACCUMULO-4801
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4801
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0, 2.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently each time a connection is requested from the thrift transport 
> pool, three methods are called on client context to get ssl, sasl, and 
> timeout.  These in turn call methods on configuration.  This is showing up in 
> profiling as slow.  I wonder if these could be precomputed in the client 
> context constructor.
>  
> Also, repeatedly calling rpcCreds() on client context is showing up as slow.





[jira] [Updated] (ACCUMULO-4789) Scans spend significant time constructing debug string.

2018-02-15 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner updated ACCUMULO-4789:
---
Affects Version/s: 1.7.3
   1.8.1

> Scans spend significant time constructing debug string.
> ---
>
> Key: ACCUMULO-4789
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4789
> Project: Accumulo
>  Issue Type: Improvement
>Affects Versions: 1.7.3, 1.8.1
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0, 2.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> While profiling a Fluo test running lots of little scans, I noticed a string 
> builder operation showing up prominently in the profiling results.  Below is 
> a link to the problematic code.  Calling range toString was the most 
> expensive part followed by KeyExtent toString.
> [https://github.com/apache/accumulo/blob/rel/1.7.3/core/src/main/java/org/apache/accumulo/core/client/impl/ThriftScanner.java#L405]
>  
> I am not sure if we can change this in 1.7 and 1.8/1.9 because people may 
> rely on this for debugging.  In 2.0 we may want to consider removing this (or 
> moving it inside the logging code block).
> Also, while looking at this I noticed that some of the log statements called 
> String.format.  Those should be placed in an if (log.isTraceEnabled()) block.
>  
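
The guard described above avoids paying for toString()/String.format when tracing
is off. A small self-contained demonstration (the counter exists only to make the
saved work observable; the real code would use the logging API's level check):

```java
// Demonstrates that guarding message construction skips the expensive
// toString() calls entirely when tracing is disabled.
class TraceGuardDemo {
  static int toStringCalls = 0;

  static final class Extent {
    @Override
    public String toString() {
      toStringCalls++; // expensive in the real code; counted here
      return "table;row1;row0";
    }
  }

  static void logScan(boolean traceEnabled, Extent extent) {
    if (traceEnabled) {
      // Only build the message when the level is actually on.
      String msg = String.format("Scanning tablet %s", extent);
    }
  }
}
```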





[jira] [Updated] (ACCUMULO-4789) Scans spend significant time constructing debug string.

2018-02-15 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner updated ACCUMULO-4789:
---
Fix Version/s: 2.0.0
   1.9.0

> Scans spend significant time constructing debug string.
> ---
>
> Key: ACCUMULO-4789
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4789
> Project: Accumulo
>  Issue Type: Improvement
>Affects Versions: 1.7.3, 1.8.1
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0, 2.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> While profiling a Fluo test running lots of little scans, I noticed a string 
> builder operation showing up prominently in the profiling results.  Below is 
> a link to the problematic code.  Calling range toString was the most 
> expensive part followed by KeyExtent toString.
> [https://github.com/apache/accumulo/blob/rel/1.7.3/core/src/main/java/org/apache/accumulo/core/client/impl/ThriftScanner.java#L405]
>  
> I am not sure if we can change this in 1.7 and 1.8/1.9 because people may 
> rely on this for debugging.  In 2.0 we may want to consider removing this (or 
> moving it inside the logging code block).
> Also, while looking at this I noticed that some of the log statements called 
> String.format.  Those should be placed in an if (log.isTraceEnabled()) block.
>  





[jira] [Assigned] (ACCUMULO-4801) Consider precomputing some client context fields

2018-02-15 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner reassigned ACCUMULO-4801:
--

Assignee: Keith Turner

> Consider precomputing some client context fields
> 
>
> Key: ACCUMULO-4801
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4801
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently each time a connection is requested from the thrift transport 
> pool, three methods are called on client context to get ssl, sasl, and 
> timeout.  These in turn call methods on configuration.  This is showing up in 
> profiling as slow.  I wonder if these could be precomputed in the client 
> context constructor.
>  
> Also, repeatedly calling rpcCreds() on client context is showing up as slow.





[jira] [Assigned] (ACCUMULO-4789) Scans spend significant time constructing debug string.

2018-02-15 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner reassigned ACCUMULO-4789:
--

Assignee: Keith Turner

> Scans spend significant time constructing debug string.
> ---
>
> Key: ACCUMULO-4789
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4789
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> While profiling a Fluo test running lots of little scans, I noticed a string 
> builder operation showing up prominently in the profiling results.  Below is 
> a link to the problematic code.  Calling range toString was the most 
> expensive part followed by KeyExtent toString.
> [https://github.com/apache/accumulo/blob/rel/1.7.3/core/src/main/java/org/apache/accumulo/core/client/impl/ThriftScanner.java#L405]
>  
> I am not sure if we can change this in 1.7 and 1.8/1.9 because people may 
> rely on this for debugging.  In 2.0 we may want to consider removing this (or 
> moving it inside the logging code block).
> Also, while looking at this I noticed that some of the log statements called 
> String.format.  Those should be placed in an if (log.isTraceEnabled()) block.
>  





[jira] [Resolved] (ACCUMULO-4789) Scans spend significant time constructing debug string.

2018-02-15 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4789.

Resolution: Fixed

> Scans spend significant time constructing debug string.
> ---
>
> Key: ACCUMULO-4789
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4789
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> While profiling a Fluo test running lots of little scans, I noticed a string 
> builder operation showing up prominently in the profiling results.  Below is 
> a link to the problematic code.  Calling range toString was the most 
> expensive part followed by KeyExtent toString.
> [https://github.com/apache/accumulo/blob/rel/1.7.3/core/src/main/java/org/apache/accumulo/core/client/impl/ThriftScanner.java#L405]
>  
> I am not sure if we can change this in 1.7 and 1.8/1.9 because people may 
> rely on this for debugging.  In 2.0 we may want to consider removing this (or 
> moving it inside the logging code block).
> Also, while looking at this I noticed that some of the log statements called 
> String.format.  Those should be placed in an if (log.isTraceEnabled()) block.
>  





[jira] [Commented] (ACCUMULO-4788) Improve Thrift Transport pool

2018-02-15 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16365802#comment-16365802
 ] 

Keith Turner commented on ACCUMULO-4788:


In PR #385 the global lock was not addressed.  The only change it made was to 
make the operations done when the lock was held much faster.

> Improve Thrift Transport pool
> -
>
> Key: ACCUMULO-4788
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4788
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Accumulo has a pool of recently opened connections to tablet servers.  When 
> connecting to tablet servers, this pool is checked first. The pool is built 
> around a map of lists.  There are two problems with this pool:
>  * It has a global lock around the map of lists
>  * When trying to find a connection, it does a linear search for a
> non-reserved connection (this is per tablet server)
> Could possibly move to a model of having a list of unreserved connections and 
> a set of reserved connections per tablet server. Then to get a connection, 
> could remove from the unreserved list and add to the reserved set.  This 
> would be a constant time operation.
> For the locking, could move to a model of using a concurrent map and locking 
> per tserver instead of locking the entire map.
>  





[jira] [Updated] (ACCUMULO-4782) With many threads scanning seeing lock contention on SessionManager

2018-02-15 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner updated ACCUMULO-4782:
---
Fix Version/s: 2.0.0
   1.9.0

> With many threads scanning seeing lock contention on SessionManager
> ---
>
> Key: ACCUMULO-4782
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4782
> Project: Accumulo
>  Issue Type: Bug
>Affects Versions: 1.7.3, 1.8.1
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0, 2.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> While profiling many threads doing small scans against accumulo, lock 
> contention on the tablet servers SessionManager was high.





[jira] [Assigned] (ACCUMULO-4782) With many threads scanning seeing lock contention on SessionManager

2018-02-15 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner reassigned ACCUMULO-4782:
--

Assignee: Keith Turner

> With many threads scanning seeing lock contention on SessionManager
> ---
>
> Key: ACCUMULO-4782
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4782
> Project: Accumulo
>  Issue Type: Bug
>Affects Versions: 1.7.3, 1.8.1
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0, 2.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> While profiling many threads doing small scans against accumulo, lock 
> contention on the tablet servers SessionManager was high.





[jira] [Resolved] (ACCUMULO-4782) With many threads scanning seeing lock contention on SessionManager

2018-02-15 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4782.

Resolution: Fixed

> With many threads scanning seeing lock contention on SessionManager
> ---
>
> Key: ACCUMULO-4782
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4782
> Project: Accumulo
>  Issue Type: Bug
>Affects Versions: 1.7.3, 1.8.1
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0, 2.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> While profiling many threads doing small scans against accumulo, lock 
> contention on the tablet servers SessionManager was high.





[jira] [Assigned] (ACCUMULO-4798) Copying Stat in ZooCache is slow

2018-02-15 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner reassigned ACCUMULO-4798:
--

Assignee: Keith Turner

> Copying Stat in ZooCache is slow
> 
>
> Key: ACCUMULO-4798
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4798
> Project: Accumulo
>  Issue Type: Improvement
>Affects Versions: 1.7.3, 1.8.1
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0, 2.0.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The ZooKeeper cache code caches ZooKeeper stats.  When a stat is requested 
> from the cache it copies it.  The ZK Stat class offers no good way to copy 
> other than serialize and deserialize.  The code currently does this, and it's 
> slow.  All code in Accumulo only uses one field from stat, so it would be 
> much better to create a simple class that has this one field and can quickly 
> copy.  
>  
> The stat is used very frequently in the metadata cache code to check if a 
> tserver still holds its lock.
>  
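
A minimal stat holder along the lines described might look like the following,
assuming the one field needed is the ephemeral owner used for the tserver lock
check (the class and field names are illustrative):

```java
// Holds only the single field Accumulo reads from a ZooKeeper Stat, so a
// copy is one long assignment instead of a serialize/deserialize round
// trip of the full Stat object.
class SlimStat {
  private final long ephemeralOwner;

  SlimStat(long ephemeralOwner) {
    this.ephemeralOwner = ephemeralOwner;
  }

  // Cheap copy: a single primitive assignment.
  SlimStat copy() {
    return new SlimStat(ephemeralOwner);
  }

  long getEphemeralOwner() {
    return ephemeralOwner;
  }
}
```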





[jira] [Resolved] (ACCUMULO-4798) Copying Stat in ZooCache is slow

2018-02-15 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4798.

Resolution: Fixed

> Copying Stat in ZooCache is slow
> 
>
> Key: ACCUMULO-4798
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4798
> Project: Accumulo
>  Issue Type: Improvement
>Affects Versions: 1.7.3, 1.8.1
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.9.0, 2.0.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The ZooKeeper cache code caches ZooKeeper stats.  When a stat is requested 
> from the cache it copies it.  The ZK Stat class offers no good way to copy 
> other than serialize and deserialize.  The code currently does this, and it's 
> slow.  All code in Accumulo only uses one field from stat, so it would be 
> much better to create a simple class that has this one field and can quickly 
> copy.  
>  
> The stat is used very frequently in the metadata cache code to check if a 
> tserver still holds its lock.
>  





[jira] [Commented] (ACCUMULO-4806) Allow offline bulk imports

2018-02-14 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16364493#comment-16364493
 ] 

Keith Turner commented on ACCUMULO-4806:


For the possible workflow I mentioned I was thinking the offline bulk import 
could use the mapping file mentioned in ACCUMULO-4813. I think this could make 
that entire sequence of operations very fast.

> Allow offline bulk imports
> --
>
> Key: ACCUMULO-4806
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4806
> Project: Accumulo
>  Issue Type: Sub-task
>  Components: master, tserver
>Reporter: Mark Owens
>Assignee: Michael Miller
>Priority: Major
> Fix For: 2.0.0
>
>
> Allowing offline bulk imports would be useful for some customers. Currently 
> these customers already take tables offline to set split points but then have 
> to bring them back online before starting the import.





[jira] [Comment Edited] (ACCUMULO-4806) Allow offline bulk imports

2018-02-14 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16364471#comment-16364471
 ] 

Keith Turner edited comment on ACCUMULO-4806 at 2/14/18 5:31 PM:
-

[~etcoleman] if create table supported creating an offline table, would the 
following work flow be useful?
 * Create offline table
 * Add splits to offline table
 * Bulk import to offline table
 * Bring table online


was (Author: kturner):
[~etcoleman] if create table supported creating and offline table, would the 
following work flow be useful?
 * Create offline table
 * Add splits to offline table
 * Bulk import to offline table
 * Bring table online

> Allow offline bulk imports
> --
>
> Key: ACCUMULO-4806
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4806
> Project: Accumulo
>  Issue Type: Sub-task
>  Components: master, tserver
>Reporter: Mark Owens
>Assignee: Michael Miller
>Priority: Major
> Fix For: 2.0.0
>
>
> Allowing offline bulk imports would be useful for some customers. Currently 
> these customers already take tables offline to set split points but then have 
> to bring them back online before starting the import.





[jira] [Commented] (ACCUMULO-4808) Add splits to table at table creation.

2018-02-14 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16364472#comment-16364472
 ] 

Keith Turner commented on ACCUMULO-4808:


Another possible way to offer this functionality is to allow creating tables in 
offline mode and allow splitting offline tables.

> Add splits to table at table creation.
> --
>
> Key: ACCUMULO-4808
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4808
> Project: Accumulo
>  Issue Type: New Feature
>  Components: master, tserver
>Reporter: Mark Owens
>Assignee: Mark Owens
>Priority: Major
> Fix For: 2.0.0
>
>
> Add capability to add table splits at table creation. Recent changes now 
> allow iterator and locality groups to be created at table creation. Do the 
> same with splits. Comment below from 
> [ACCUMULO-4806|https://issues.apache.org/jira/browse/ACCUMULO-4806] explains 
> the motivation for the request:
> {quote}[~etcoleman] added a comment - 2 hours ago
> It would go a long way if the splits could be added at table creation or 
> when table is offline.  When the other API changes were made by Mark, I 
> wondered if this task could also be done at that time - but I believe 
> that it was more complicated.
> The delay is that when a table is created and then the splits added and then 
> taken offline there is a period proportional to the number of splits as they 
> are off-loaded from the tserver where they originally got assigned.  (The 
> re-online with splits distributed across the cluster is quite fast)
> If the splits could be added at table creation, or while the table is offline 
> so that the delay for shedding the tablets could be avoided, then the need to 
> perform the actual import offline would not be as necessary.
>  
> {quote}
>  





[jira] [Commented] (ACCUMULO-4806) Allow offline bulk imports

2018-02-14 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16364471#comment-16364471
 ] 

Keith Turner commented on ACCUMULO-4806:


[~etcoleman] if create table supported creating an offline table, would the 
following workflow be useful?
 * Create offline table
 * Add splits to offline table
 * Bulk import to offline table
 * Bring table online

> Allow offline bulk imports
> --
>
> Key: ACCUMULO-4806
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4806
> Project: Accumulo
>  Issue Type: Sub-task
>  Components: master, tserver
>Reporter: Mark Owens
>Assignee: Michael Miller
>Priority: Major
> Fix For: 2.0.0
>
>
> Allowing offline bulk imports would be useful for some customers. Currently 
> these customers already take tables offline to set split points but then have 
> to bring them back online before starting the import.





[jira] [Assigned] (ACCUMULO-4788) Improve Thrift Transport pool

2018-02-13 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner reassigned ACCUMULO-4788:
--

Assignee: Keith Turner

> Improve Thrift Transport pool
> -
>
> Key: ACCUMULO-4788
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4788
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Assignee: Keith Turner
>Priority: Major
> Fix For: 2.0.0
>
>
> Accumulo has a pool of recently opened connections to tablet servers.  When 
> connecting to tablet servers, this pool is checked first. The pool is built 
> around a map of lists.  There are two problems with this pool:
>  * It has a global lock around the map of lists
>  * When trying to find a connection, it does a linear search for a 
> non-reserved connection (this is per tablet server)
> Could possibly move to a model of having a list of unreserved connections and 
> a set of reserved connections per tablet server. Then to get a connection, 
> could remove from the unreserved list and add to the reserved set.  This 
> would be a constant time operation.
> For the locking, could move to a model of using a concurrent map and locking 
> per tserver instead of locking the entire map.
>  
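
The proposed structure could be sketched as follows. This is a hypothetical, simplified model (plain placeholder `Connection` objects, string server addresses, class names invented for illustration), not Accumulo's actual transport pool code; it shows the per-tserver locking and the O(1) reserve/unreserve the comment describes.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class TransportPool {
  static class Connection {}

  // Per-tserver state, guarded by its own monitor instead of a global lock.
  static class ServerPool {
    final Deque<Connection> unreserved = new ArrayDeque<>();
    final Set<Connection> reserved = new HashSet<>();

    synchronized Connection reserve() {
      Connection c = unreserved.pollFirst(); // O(1), no linear scan
      if (c == null)
        c = new Connection();                // no cached connection: open a new one
      reserved.add(c);
      return c;
    }

    synchronized void unreserve(Connection c) {
      if (reserved.remove(c))
        unreserved.addFirst(c);              // O(1) return to the cache
    }
  }

  private final ConcurrentHashMap<String, ServerPool> pools = new ConcurrentHashMap<>();

  Connection reserve(String tserver) {
    return pools.computeIfAbsent(tserver, k -> new ServerPool()).reserve();
  }

  void unreserve(String tserver, Connection c) {
    ServerPool p = pools.get(tserver);
    if (p != null)
      p.unreserve(c);
  }
}

public class PoolDemo {
  public static void main(String[] args) {
    TransportPool pool = new TransportPool();
    TransportPool.Connection c = pool.reserve("tserver1:9997");
    pool.unreserve("tserver1:9997", c);
    // The cached connection is reused on the next reserve.
    System.out.println(pool.reserve("tserver1:9997") == c);
  }
}
```

Because contention is confined to one small `ServerPool` monitor, threads talking to different tablet servers never block each other.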





[jira] [Resolved] (ACCUMULO-4709) Add size sanity checks to Mutations

2018-02-12 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner resolved ACCUMULO-4709.

Resolution: Fixed

> Add size sanity checks to Mutations
> ---
>
> Key: ACCUMULO-4709
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4709
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Assignee: Gergely Hajós
>Priority: Major
>  Labels: newbie, pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Based on ACCUMULO-4708, it may be good to add size sanity checks to 
> Accumulo's Mutation data type.  The first step would be to determine how 
> Mutation handles the following situations currently.
>  * Create a mutation and put lots of small entries where total size exceeds 
> 2GB
>  * Create a mutation and add a single entry where the total of all fields 
> exceeds 2GB, but no individual field exceeds 2GB





[jira] [Updated] (ACCUMULO-4813) Accepting mapping file for bulk import

2018-02-12 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner updated ACCUMULO-4813:
---
Description: 
During bulk import, inspecting files to determine where they go is expensive 
and slow.  In order to spread the cost, Accumulo has an internal mechanism to 
spread the work of inspecting files to random tablet servers.  Because this 
internal process takes time and consumes resources on the cluster, users want 
control over it.  The best way to give this control may be to externalize it by 
allowing bulk imports to have a mapping file.  This mapping file would specify 
the ranges where files should be loaded.  If Accumulo provided API to help 
produce this file, then that work could be done in Map Reduce or Spark.  This 
would give users all the control they want over when and where this computation 
is done.  This would naturally fit in the process used to create the bulk 
files. 

To make bulk import fast this mapping file should have the following properties.
 * Key in file is a range
 * Value in file is a list of files
 * Ranges are non overlapping
 * File is sorted by range/key
 * Has a mapping for every non-empty file in the bulk import directory.

If Accumulo provides APIs to do the following operations, then producing the 
file could be written as a map/reduce job.
 * For a given rfile produce a list of row ranges where the file should be 
loaded.  These row ranges would be based on tablets.
 * Merge row range,list of file pairs
 * Serialize row range,list of files pairs
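
The merge step above can be sketched minimally, under the assumption that row ranges are modeled as plain end-row strings and files as path strings (the real keys would be Accumulo ranges or tablet extents; the class name is invented):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

public class MergeMappings {
  // Merge two sorted range -> file-list mappings (e.g. partial outputs of
  // map/reduce tasks); file lists are concatenated for ranges in both inputs.
  static SortedMap<String, List<String>> merge(SortedMap<String, List<String>> a,
      SortedMap<String, List<String>> b) {
    SortedMap<String, List<String>> out = new TreeMap<>(a);
    b.forEach((range, files) -> out.merge(range, files, (x, y) -> {
      List<String> both = new ArrayList<>(x);
      both.addAll(y);
      return both;
    }));
    return out;
  }

  public static void main(String[] args) {
    SortedMap<String, List<String>> a = new TreeMap<>();
    a.put("row_m", List.of("f1.rf"));
    SortedMap<String, List<String>> b = new TreeMap<>();
    b.put("row_m", List.of("f2.rf"));
    b.put("row_z", List.of("f3.rf"));
    // Output stays sorted by range, with both files mapped to row_m.
    System.out.println(merge(a, b));
  }
}
```

Since each reduce task can merge its inputs independently, this shape fits a map/reduce or Spark job naturally.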

With a mapping file, the bulk import algorithm could be written as follows.  
This could all be executed in the master with no need to run inspection task on 
random tablet servers.
 * Sanity check file
 ** Ensure in sorted order
 ** Ensure ranges are non-overlapping
 ** Ensure each file in directory has at least one entry in file
 ** Ensure all splits in the file exist in the table.
 * Since file is sorted can do a merged read of file and metadata table, 
looping over the following operations for each tablet until all files are 
loaded.
 ** Read the loaded files for the tablet
 ** Read the files to load for the range
 ** For any files not loaded, send an async load message to the tablet server

The above algorithm can just keep scanning the metadata table and sending async 
load messages until the bulk import is complete.  Since the load messages are 
async, the bulk load of a large number of files could potentially be very 
fast.
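
The scan-and-send loop just described can be sketched like this; the metadata table and the async load RPC are simulated with in-memory maps, purely for illustration of the control flow:

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class BulkLoadDriver {
  // tablet end row -> files already loaded (stand-in for the metadata table)
  static Map<String, Set<String>> metadata = new TreeMap<>();
  // tablet end row -> files that should be loaded (from the mapping file)
  static Map<String, Set<String>> mapping = new TreeMap<>();

  // Stand-in for the async load RPC; here it "lands" immediately.
  static void sendAsyncLoad(String tablet, String file) {
    metadata.get(tablet).add(file);
  }

  public static void main(String[] args) {
    metadata.put("m", new HashSet<>(Set.of("f1.rf")));
    mapping.put("m", Set.of("f1.rf", "f2.rf"));

    boolean done = false;
    while (!done) {                       // keep scanning until all files are loaded
      done = true;
      for (Map.Entry<String, Set<String>> e : mapping.entrySet()) {
        for (String f : e.getValue()) {
          if (!metadata.get(e.getKey()).contains(f)) {
            sendAsyncLoad(e.getKey(), f); // only unloaded files get a message
            done = false;                 // re-scan to confirm the load landed
          }
        }
      }
    }
    System.out.println(metadata.get("m").size()); // both files now loaded
  }
}
```

Because both the mapping and the metadata scan are sorted by range, the real implementation can do a single merged pass per iteration rather than per-tablet lookups.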

The bulk load operation can easily handle the case of tablets splitting during 
the operation by matching a single range in the file to multiple tablets.  
However attempting to handle merges would be a lot more tricky.  It would 
probably be simplest to fail the operation if a merge is detected.  The nice 
thing is that this can be done in a very clean way.   Once the bulk import 
operation has the table lock, merges can not happen.  So after getting the 
table lock the bulk import operation can ensure all splits in the file exist in 
the table. The operation can abort if the condition is not met before doing any 
work.  If this condition is not met, it indicates a merge happened between 
generating the mapping file and doing the bulk import.

Hopefully the mapping file plus the algorithm that sends async load messages 
can dramatically speed up bulk import operations.  This may lessen the need for 
other things like prioritizing bulk import.  To measure this, it would be very 
useful to create a bulk import performance test that can create many files with 
very little data and measure the time it takes to load them.

  was:
During bulk import, inspecting files to determine where they go is expensive 
and slow.  In order to spread the cost, Accumulo has an internal mechanism to 
spread the work of inspecting files to random tablet servers.  Because this 
internal process takes time and consumes resources on the cluster, users want 
control over it.  The best way to give this control may be to externalize it by 
allowing bulk imports to have a mapping file.  This mapping file would specify 
the ranges where files should be loaded.  If Accumulo provided API to help 
produce this file, then that work could be done in Map Reduce or Spark.  This 
would give users all the control they want over when and where this computation 
is done.  This would naturally fit in the process used to create the bulk 
files. 

To make bulk import fast this mapping file should have the following properties.
 * Key in file is a range
 * Value in file is a list of files
 * Ranges are non overlapping
 * File is sorted by range/key
 * Has a mapping for every non-empty file in the bulk import directory.

If Accumulo provides APIs to do the following operations, then producing the 
file could be written as a map/reduce job.
 * For a given file produce a list of ranges
 * Merge range,list of file pairs
 * Serialize 

[jira] [Commented] (ACCUMULO-4813) Accepting mapping file for bulk import

2018-02-12 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361089#comment-16361089
 ] 

Keith Turner commented on ACCUMULO-4813:


Conceptually the contents of this special mapping file would be something like:
{code:java}
  SortedMap
{code}
The file would need a special file extension, not sure what it would be.  Maybe 
.lm for load mapping?

> Accepting mapping file for bulk import
> --
>
> Key: ACCUMULO-4813
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4813
> Project: Accumulo
>  Issue Type: Sub-task
>Reporter: Keith Turner
>Priority: Major
> Fix For: 2.0.0
>
>
> During bulk import, inspecting files to determine where they go is expensive 
> and slow.  In order to spread the cost, Accumulo has an internal mechanism to 
> spread the work of inspecting files to random tablet servers.  Because this 
> internal process takes time and consumes resources on the cluster, users want 
> control over it.  The best way to give this control may be to externalize it 
> by allowing bulk imports to have a mapping file.  This mapping file would 
> specify the ranges where files should be loaded.  If Accumulo provided API to 
> help produce this file, then that work could be done in Map Reduce or Spark.  
> This would give users all the control they want over when and where this 
> computation is done.  This would naturally fit in the process used to create 
> the bulk files. 
> To make bulk import fast this mapping file should have the following 
> properties.
>  * Key in file is a range
>  * Value in file is a list of files
>  * Ranges are non overlapping
>  * File is sorted by range/key
>  * Has a mapping for every non-empty file in the bulk import directory.
> If Accumulo provides APIs to do the following operations, then producing the 
> file could be written as a map/reduce job.
>  * For a given file produce a list of ranges
>  * Merge range,list of file pairs
>  * Serialize range,list of files pairs
> With a mapping file, the bulk import algorithm could be written as follows.  
> This could all be executed in the master with no need to run inspection task 
> on random tablet servers.
>  * Sanity check file
>  ** Ensure in sorted order
>  ** Ensure ranges are non-overlapping
>  ** Ensure each file in directory has at least one entry in file
>  ** Ensure all splits in the file exist in the table.
>  * Since file is sorted can do a merged read of file and metadata table, 
> looping over the following operations for each tablet until all files are 
> loaded.
>  ** Read the loaded files for the tablet
>  ** Read the files to load for the range
>  ** For any files not loaded, send an async load message to the tablet server
> The above algorithm can just keep scanning the metadata table and sending 
> async load messages until the bulk import is complete.  Since the load 
> messages are async, the bulk load of a large number of files could 
> potentially be very fast.
> The bulk load operation can easily handle the case of tablets splitting 
> during the operation by matching a single range in the file to multiple 
> tablets.  However attempting to handle merges would be a lot more tricky.  It 
> would probably be simplest to fail the operation if a merge is detected.  The 
> nice thing is that this can be done in a very clean way.   Once the bulk 
> import operation has the table lock, merges can not happen.  So after getting 
> the table lock the bulk import operation can ensure all splits in the file 
> exist in the table. The operation can abort if the condition is not met 
> before doing any work.  If this condition is not met, it indicates a merge 
> happened between generating the mapping file and doing the bulk import.
> Hopefully the mapping file plus the algorithm that sends async load messages 
> can dramatically speed up bulk import operations.  This may lessen the need 
> for other things like prioritizing bulk import.  To measure this, it would be 
> very useful to create a bulk import performance test that can create many files 
> with very little data and measure the time it takes to load them.





[jira] [Comment Edited] (ACCUMULO-4813) Accepting mapping file for bulk import

2018-02-12 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361089#comment-16361089
 ] 

Keith Turner edited comment on ACCUMULO-4813 at 2/12/18 5:06 PM:
-

Conceptually the contents of this special mapping file would be something like.
{code:java}
  SortedMap
{code}
The file would need a special file extension, not sure what it would be.  Maybe 
.lm for load mapping?


was (Author: kturner):
Conceptually the contents of this special mapping file would something like.
{code:java}
  SortedMap
{code}
The file would need a special file extension, not sure what it would be.  Maybe 
.lm for load mapping?

> Accepting mapping file for bulk import
> --
>
> Key: ACCUMULO-4813
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4813
> Project: Accumulo
>  Issue Type: Sub-task
>Reporter: Keith Turner
>Priority: Major
> Fix For: 2.0.0
>
>
> During bulk import, inspecting files to determine where they go is expensive 
> and slow.  In order to spread the cost, Accumulo has an internal mechanism to 
> spread the work of inspecting files to random tablet servers.  Because this 
> internal process takes time and consumes resources on the cluster, users want 
> control over it.  The best way to give this control may be to externalize it 
> by allowing bulk imports to have a mapping file.  This mapping file would 
> specify the ranges where files should be loaded.  If Accumulo provided API to 
> help produce this file, then that work could be done in Map Reduce or Spark.  
> This would give users all the control they want over when and where this 
> computation is done.  This would naturally fit in the process used to create 
> the bulk files. 
> To make bulk import fast this mapping file should have the following 
> properties.
>  * Key in file is a range
>  * Value in file is a list of files
>  * Ranges are non overlapping
>  * File is sorted by range/key
>  * Has a mapping for every non-empty file in the bulk import directory.
> If Accumulo provides APIs to do the following operations, then producing the 
> file could be written as a map/reduce job.
>  * For a given file produce a list of ranges
>  * Merge range,list of file pairs
>  * Serialize range,list of files pairs
> With a mapping file, the bulk import algorithm could be written as follows.  
> This could all be executed in the master with no need to run inspection task 
> on random tablet servers.
>  * Sanity check file
>  ** Ensure in sorted order
>  ** Ensure ranges are non-overlapping
>  ** Ensure each file in directory has at least one entry in file
>  ** Ensure all splits in the file exist in the table.
>  * Since file is sorted can do a merged read of file and metadata table, 
> looping over the following operations for each tablet until all files are 
> loaded.
>  ** Read the loaded files for the tablet
>  ** Read the files to load for the range
>  ** For any files not loaded, send an async load message to the tablet server
> The above algorithm can just keep scanning the metadata table and sending 
> async load messages until the bulk import is complete.  Since the load 
> messages are async, the bulk load of a large number of files could 
> potentially be very fast.
> The bulk load operation can easily handle the case of tablets splitting 
> during the operation by matching a single range in the file to multiple 
> tablets.  However attempting to handle merges would be a lot more tricky.  It 
> would probably be simplest to fail the operation if a merge is detected.  The 
> nice thing is that this can be done in a very clean way.   Once the bulk 
> import operation has the table lock, merges can not happen.  So after getting 
> the table lock the bulk import operation can ensure all splits in the file 
> exist in the table. The operation can abort if the condition is not met 
> before doing any work.  If this condition is not met, it indicates a merge 
> happened between generating the mapping file and doing the bulk import.
> Hopefully the mapping file plus the algorithm that sends async load messages 
> can dramatically speed up bulk import operations.  This may lessen the need 
> for other things like prioritizing bulk import.  To measure this, it would be 
> very useful to create a bulk import performance test that can create many files 
> with very little data and measure the time it takes to load them.





[jira] [Created] (ACCUMULO-4813) Accepting mapping file for bulk import

2018-02-12 Thread Keith Turner (JIRA)
Keith Turner created ACCUMULO-4813:
--

 Summary: Accepting mapping file for bulk import
 Key: ACCUMULO-4813
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4813
 Project: Accumulo
  Issue Type: Sub-task
Reporter: Keith Turner
 Fix For: 2.0.0


During bulk import, inspecting files to determine where they go is expensive 
and slow.  In order to spread the cost, Accumulo has an internal mechanism to 
spread the work of inspecting files to random tablet servers.  Because this 
internal process takes time and consumes resources on the cluster, users want 
control over it.  The best way to give this control may be to externalize it by 
allowing bulk imports to have a mapping file.  This mapping file would specify 
the ranges where files should be loaded.  If Accumulo provided API to help 
produce this file, then that work could be done in Map Reduce or Spark.  This 
would give users all the control they want over when and where this computation 
is done.  This would naturally fit in the process used to create the bulk 
files. 

To make bulk import fast this mapping file should have the following properties.
 * Key in file is a range
 * Value in file is a list of files
 * Ranges are non overlapping
 * File is sorted by range/key
 * Has a mapping for every non-empty file in the bulk import directory.

If Accumulo provides APIs to do the following operations, then producing the 
file could be written as a map/reduce job.
 * For a given file produce a list of ranges
 * Merge range,list of file pairs
 * Serialize range,list of files pairs

With a mapping file, the bulk import algorithm could be written as follows.  
This could all be executed in the master with no need to run inspection task on 
random tablet servers.
 * Sanity check file
 ** Ensure in sorted order
 ** Ensure ranges are non-overlapping
 ** Ensure each file in directory has at least one entry in file
 ** Ensure all splits in the file exist in the table.
 * Since file is sorted can do a merged read of file and metadata table, 
looping over the following operations for each tablet until all files are 
loaded.
 ** Read the loaded files for the tablet
 ** Read the files to load for the range
 ** For any files not loaded, send an async load message to the tablet server

The above algorithm can just keep scanning the metadata table and sending async 
load messages until the bulk import is complete.  Since the load messages are 
async, the bulk load of a large number of files could potentially be very 
fast.

The bulk load operation can easily handle the case of tablets splitting during 
the operation by matching a single range in the file to multiple tablets.  
However attempting to handle merges would be a lot more tricky.  It would 
probably be simplest to fail the operation if a merge is detected.  The nice 
thing is that this can be done in a very clean way.   Once the bulk import 
operation has the table lock, merges can not happen.  So after getting the 
table lock the bulk import operation can ensure all splits in the file exist in 
the table. The operation can abort if the condition is not met before doing any 
work.  If this condition is not met, it indicates a merge happened between 
generating the mapping file and doing the bulk import.

Hopefully the mapping file plus the algorithm that sends async load messages 
can dramatically speed up bulk import operations.  This may lessen the need for 
other things like prioritizing bulk import.  To measure this, it would be very 
useful to create a bulk import performance test that can create many files with 
very little data and measure the time it takes to load them.





[jira] [Commented] (ACCUMULO-4808) Add splits to table at table creation.

2018-02-09 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359034#comment-16359034
 ] 

Keith Turner commented on ACCUMULO-4808:


Create table is a FATE operation. FATE operations are persisted in ZooKeeper and 
therefore should be small.  So it would not be good to include table splits in 
a FATE repo, as this could be a large amount of data.  One possible way to avoid 
this is to store the split points in a file in HDFS before the FATE op is 
started.  The master could do this and store it in an Accumulo dir in DFS (just 
randomly pick a volume).  The FATE repo would then only need to store the file 
path in ZK.
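
The idea can be sketched as follows, using a local temp file to stand in for HDFS and a hypothetical `CreateTableRepo` holder for the FATE repo data; only the small path string would be persisted to ZooKeeper:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class SplitsViaFile {
  // Stand-in for a FATE repo: small enough to persist in ZooKeeper because it
  // carries only the path to the split file, not the splits themselves.
  static class CreateTableRepo {
    final String tableName;
    final String splitsPath;

    CreateTableRepo(String tableName, String splitsPath) {
      this.tableName = tableName;
      this.splitsPath = splitsPath;
    }
  }

  public static void main(String[] args) throws IOException {
    List<String> splits = List.of("row_d", "row_m", "row_t");

    // The (possibly large) split list is written to a file first; a local
    // temp file stands in for a randomly chosen HDFS volume here.
    Path splitsFile = Files.createTempFile("table-splits", ".txt");
    Files.write(splitsFile, splits);

    CreateTableRepo repo = new CreateTableRepo("mytable", splitsFile.toString());

    // The FATE op later reads the splits back via the stored path.
    System.out.println(Files.readAllLines(Path.of(repo.splitsPath)).size());
  }
}
```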

> Add splits to table at table creation.
> --
>
> Key: ACCUMULO-4808
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4808
> Project: Accumulo
>  Issue Type: Sub-task
>  Components: master, tserver
>Reporter: Mark Owens
>Assignee: Mark Owens
>Priority: Major
> Fix For: 2.0.0
>
>
> Add capability to add table splits at table creation. Recent changes now 
> allow iterators and locality groups to be created at table creation. Do the 
> same with splits. Comment below from 
> [ACCUMULO-4806|https://issues.apache.org/jira/browse/ACCUMULO-4806] explains 
> the motivation for the request:
> {quote}[~etcoleman] added a comment - 2 hours ago
> It would go a long way if the splits could be added at table creation or 
> when the table is offline.  When the other API changes were made by Mark, I 
> wondered if this task could also be done at that time - but I believe 
> that it was more complicated.
> The delay is that when a table is created and then the splits added and then 
> taken offline there is a period proportional to the number of splits as they 
> are off-loaded from the tserver where they originally got assigned.  (The 
> re-online with splits distributed across the cluster is quite fast)
> If the splits could be added at table creation, or while the table is offline 
> so that the delay for shedding the tablets could be avoided, then the need to 
> perform the actual import offline would not be as necessary.
>  
> {quote}
>  





[jira] [Commented] (ACCUMULO-4806) Allow offline bulk imports

2018-02-09 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359022#comment-16359022
 ] 

Keith Turner commented on ACCUMULO-4806:


[~etcoleman] do you know if there is an issue for supplying splits at table 
creation time?  I have some ideas about how it could be implemented.

> Allow offline bulk imports
> --
>
> Key: ACCUMULO-4806
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4806
> Project: Accumulo
>  Issue Type: Sub-task
>  Components: master, tserver
>Reporter: Mark Owens
>Assignee: Michael Miller
>Priority: Major
> Fix For: 2.0.0
>
>
> Allowing offline bulk imports would be useful for some customers. Currently 
> these customers already take tables offline to set split points but then have 
> to bring them back online before starting the import.





[jira] [Updated] (ACCUMULO-4811) Session manager does not always act on cleanUp() return

2018-02-09 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner updated ACCUMULO-4811:
---
Fix Version/s: 2.0.0
   1.9.0
   1.7.4

> Session manager does not always act on cleanUp() return
> ---
>
> Key: ACCUMULO-4811
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4811
> Project: Accumulo
>  Issue Type: Improvement
>Affects Versions: 1.7.3, 1.8.1
>Reporter: Keith Turner
>Priority: Major
> Fix For: 1.7.4, 1.9.0, 2.0.0
>
>
> While working on ACCUMULO-4782 I noticed that the session manager does not 
> always look at the return value of session.cleanUp().  It seems like it 
> should always do something when false is returned. 





[jira] [Updated] (ACCUMULO-4811) Session manager does not always act on cleanUp() return

2018-02-09 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner updated ACCUMULO-4811:
---
Issue Type: Bug  (was: Improvement)

> Session manager does not always act on cleanUp() return
> ---
>
> Key: ACCUMULO-4811
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4811
> Project: Accumulo
>  Issue Type: Bug
>Affects Versions: 1.7.3, 1.8.1
>Reporter: Keith Turner
>Priority: Major
> Fix For: 1.7.4, 1.9.0, 2.0.0
>
>
> While working on ACCUMULO-4782 I noticed that the session manager does not 
> always look at the return value of session.cleanUp().  It seems like it 
> should always do something when false is returned. 





[jira] [Updated] (ACCUMULO-4811) Session manager does not always act on cleanUp() return

2018-02-09 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner updated ACCUMULO-4811:
---
Affects Version/s: 1.7.3
   1.8.1

> Session manager does not always act on cleanUp() return
> ---
>
> Key: ACCUMULO-4811
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4811
> Project: Accumulo
>  Issue Type: Improvement
>Affects Versions: 1.7.3, 1.8.1
>Reporter: Keith Turner
>Priority: Major
>
> While working on ACCUMULO-4782 I noticed that the session manager does not 
> always look at the return value of session.cleanUp().  It seems like it 
> should always do something when false is returned. 





[jira] [Created] (ACCUMULO-4811) Session manager does not always act on cleanUp() return

2018-02-09 Thread Keith Turner (JIRA)
Keith Turner created ACCUMULO-4811:
--

 Summary: Session manager does not always act on cleanUp() return
 Key: ACCUMULO-4811
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4811
 Project: Accumulo
  Issue Type: Improvement
Reporter: Keith Turner


While working on ACCUMULO-4782 I noticed that the session manager does not 
always look at the return value of session.cleanUp().  It seems like it should 
always do something when false is returned. 





[jira] [Created] (ACCUMULO-4810) Make session manager reservations more strict

2018-02-09 Thread Keith Turner (JIRA)
Keith Turner created ACCUMULO-4810:
--

 Summary: Make session manager reservations more strict
 Key: ACCUMULO-4810
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4810
 Project: Accumulo
  Issue Type: Improvement
Reporter: Keith Turner
 Fix For: 2.0.0


While working on ACCUMULO-4782 I noticed that the session manager was not 
strict for the following cases.
 * Removing a reserved session
 * Unreserving a removed session

For ACCUMULO-4782 I wanted to preserve existing behavior and switch to a 
concurrent map.  I think it would be nice to make reservations more strict and 
work through any bugs this causes.
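
Strict reservations could look roughly like this sketch (the class name and two-state model are illustrative, not the real SessionManager): removing a reserved session, or reserving a missing or already-reserved one, fails fast instead of being silently tolerated.

```java
import java.util.concurrent.ConcurrentHashMap;

public class SessionManagerSketch {
  enum State { UNRESERVED, RESERVED }

  private final ConcurrentHashMap<Long, State> sessions = new ConcurrentHashMap<>();

  void add(long id) {
    sessions.put(id, State.UNRESERVED);
  }

  void reserve(long id) {
    // Strict: reserving a missing or already-reserved session fails fast.
    if (!sessions.replace(id, State.UNRESERVED, State.RESERVED))
      throw new IllegalStateException("session " + id + " missing or already reserved");
  }

  void remove(long id) {
    // Strict: removing a session that is currently reserved is an error.
    sessions.compute(id, (k, v) -> {
      if (v == State.RESERVED)
        throw new IllegalStateException("removed session " + k + " while reserved");
      return null; // atomically removes the entry if present
    });
  }

  public static void main(String[] args) {
    SessionManagerSketch sm = new SessionManagerSketch();
    sm.add(1L);
    sm.reserve(1L);
    boolean caught = false;
    try {
      sm.remove(1L); // reserved, so strict mode rejects the removal
    } catch (IllegalStateException e) {
      caught = true;
    }
    System.out.println(caught);
  }
}
```

Failing fast here surfaces latent double-free-style session bugs instead of masking them, which is the point of making reservations stricter.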





[jira] [Updated] (ACCUMULO-4809) Session manager clean up can happen when lock held.

2018-02-09 Thread Keith Turner (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Keith Turner updated ACCUMULO-4809:
---
Fix Version/s: 2.0.0
   1.9.0

> Session manager clean up can happen when lock held.
> ---
>
> Key: ACCUMULO-4809
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4809
> Project: Accumulo
>  Issue Type: Bug
>Affects Versions: 1.7.3, 1.8.1
>Reporter: Keith Turner
>Priority: Critical
> Fix For: 1.7.4, 1.9.0, 2.0.0
>
>
> While working on [PR #382|https://github.com/apache/accumulo/pull/382] for 
> ACCUMULO-4782 I noticed a significant concurrency bug.  Before #382 there was 
> a single lock for the session manager. The session manager will clean up idle 
> sessions.  This cleanup should happen outside the session manager lock, 
> because all tserver read/write operations use the session manager, so it should 
> be responsive.
> The bug is the following.
>  * Both getActiveScansPerTable() and getActiveScans() lock the session 
> manager and then lock idleSessions.  See [SessionManager line 
> 233|https://github.com/apache/accumulo/blob/rel/1.7.3/server/tserver/src/main/java/org/apache/accumulo/tserver/session/SessionManager.java#L233]
>  
>  * The sweep() method locks idleSessions and does cleanup while this lock is 
> held. [See SessionManager 
> 200|https://github.com/apache/accumulo/blob/rel/1.7.3/server/tserver/src/main/java/org/apache/accumulo/tserver/session/SessionManager.java#L200]
>  
> Therefore it is possible for getActiveScansPerTable() or getActiveScans() to 
> lock the session manager and then block trying to lock idleSessions while 
> cleanup is happening in sweep().  This will block all access to the session 
> manager while cleanup happens.
> The changes in #382 will fix this for 1.9.0 and 2.0.0.  However I am not sure 
> about backporting #382 to 1.7.  A more targeted fix could be made for 1.7, or 
> #382 could be backported.
>  
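
The fix direction implied above (cleanup outside the lock) can be sketched as: snapshot the idle sessions while holding the lock, release it, then do the slow cleanup, so readers such as getActiveScans() are never blocked behind cleanup work. This is an illustrative model, not the actual #382 change.

```java
import java.util.ArrayList;
import java.util.List;

public class IdleSweep {
  private final Object lock = new Object();
  private final List<String> idleSessions = new ArrayList<>();

  void addIdle(String session) {
    synchronized (lock) {
      idleSessions.add(session);
    }
  }

  int sweep() {
    List<String> toClean;
    synchronized (lock) {     // hold the lock only long enough to snapshot
      toClean = new ArrayList<>(idleSessions);
      idleSessions.clear();
    }
    // Slow per-session cleanup would run here, outside the lock, so readers
    // that also need the lock are never stuck waiting on cleanup.
    return toClean.size();
  }

  public static void main(String[] args) {
    IdleSweep sweeper = new IdleSweep();
    sweeper.addIdle("session-a");
    sweeper.addIdle("session-b");
    System.out.println(sweeper.sweep());
  }
}
```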





  1   2   3   4   5   6   7   8   9   10   >