[jira] [Commented] (ACCUMULO-4851) WAL recovery directory should be deleted before running LogSorter

2018-04-09 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431475#comment-16431475
 ] 

Dave Marion commented on ACCUMULO-4851:
---

I don't remember an issue like this. Sorry I couldn't be of any help here.

> WAL recovery directory should be deleted before running LogSorter
> -
>
> Key: ACCUMULO-4851
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4851
> Project: Accumulo
>  Issue Type: Bug
>  Components: tserver
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Critical
> Fix For: 1.9.0
>
>
> Noticed this one on a user's 1.7-ish system.
> A number of tablets (~9) were unassigned and reported on the Monitor as 
> having failed to load. Digging into the exception, we could see the tablet 
> load failed due to a FileNotFoundException:
> {noformat}
> 2018-04-09 19:57:08,475 [tserver.TabletServer] WARN : exception trying to 
> assign tablet xk;... /accumulo/tables/xk/t-00pyzd0
> java.lang.RuntimeException: java.io.IOException: 
> java.io.FileNotFoundException: File does not exist: 
> /accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/failed/data
>     at org.apache.accumulo.tserver.tablet.Tablet.<init>(Tablet.java:640)
>     at org.apache.accumulo.tserver.tablet.Tablet.<init>(Tablet.java:449)
>     at 
> org.apache.accumulo.tserver.TabletServer$AssignmentHandler.run(TabletServer.java:2156)
>     at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>     at 
> org.apache.accumulo.tserver.ActiveAssignmentRunnable.run(ActiveAssignmentRunnable.java:61)
>     at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: java.io.FileNotFoundException: File does not 
> exist: /accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/failed/data
>     at 
> org.apache.accumulo.tserver.log.TabletServerLogger.recover(TabletServerLogger.java:480)
>     at 
> org.apache.accumulo.tserver.TabletServer.recover(TabletServer.java:3012)
>     at org.apache.accumulo.tserver.tablet.Tablet.<init>(Tablet.java:590)
>     ... 9 more
> Caused by: java.io.FileNotFoundException: File does not exist: 
> /accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/failed/data
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1446)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1438)
>     at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1454)
>     at 
> org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1823)
>     at 
> org.apache.hadoop.io.MapFile$Reader.createDataFileReader(MapFile.java:456)
>     at org.apache.hadoop.io.MapFile$Reader.open(MapFile.java:429)
>     at org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:399)
>     at 
> org.apache.accumulo.tserver.log.MultiReader.<init>(MultiReader.java:113)
>     at 
> org.apache.accumulo.tserver.log.SortedLogRecovery.recover(SortedLogRecovery.java:105)
>     at 
> org.apache.accumulo.tserver.log.TabletServerLogger.recover(TabletServerLogger.java:478)
>     ... 11 more
> 2018-04-09 19:57:08,476 [tserver.TabletServer] WARN : java.io.IOException: 
> java.io.FileNotFoundException: File does not exist: 
> /accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/failed/data
> 2018-04-09 19:57:08,476 [tserver.TabletServer] WARN : failed to open tablet 
> xk;... reporting failure to master
> 2018-04-09 19:57:08,476 [tserver.TabletServer] WARN : rescheduling tablet 
> load in 600.00 seconds
> {noformat}
> Upon further investigation of the recovery directory in HDFS for this WAL, we 
> find the following:
> {noformat}
> $ hdfs dfs -ls -R /accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/
> -rwxr--r--   3 accumulo hdfs  0 2018-04-06 22:12 
> /accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/failed
> -rwxr--r--   3 accumulo hdfs  0 2018-04-06 22:10 
> /accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/finished
> drwxr-xr-x   - accumulo hdfs  0 2018-04-06 22:09 
> /accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/part-r-0
> -rw-r--r--   3 accumulo hdfs    8040761 2018-04-06 22:09 
> /accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849/part-r-0/data
> -rw-r--r--   3 
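
To make the proposed fix concrete, here is a minimal sketch (not the actual 
ACCUMULO-4851 patch) of deleting a stale recovery directory with the Hadoop 
FileSystem API before the LogSorter re-sorts the WAL; the path is the one from 
the listing above and the class name is illustrative:

{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RecoveryDirCleanup {
  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.get(new Configuration());
    Path recoveryDir =
        new Path("/accumulo/recovery/0421c824-5e48-4bad-917a-b54a34a45849");
    // If a previous sort attempt left partial output behind (e.g. a "failed"
    // marker without a failed/data file), remove the whole directory so the
    // LogSorter starts from a clean slate instead of tablets trying to read
    // files that do not exist.
    if (fs.exists(recoveryDir)) {
      fs.delete(recoveryDir, true); // recursive delete
    }
  }
}
{code}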

[jira] [Resolved] (ACCUMULO-4664) Allow user to specify which tablet servers will be used to process bulk import files

2017-06-22 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion resolved ACCUMULO-4664.
---
Resolution: Fixed

> Allow user to specify which tablet servers will be used to process bulk 
> import files
> 
>
> Key: ACCUMULO-4664
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4664
> Project: Accumulo
>  Issue Type: Improvement
>  Components: master
>Reporter: Dave Marion
>Assignee: Dave Marion
> Fix For: 2.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When a bulk import occurs, the Master assigns the import of a single file in 
> the bulk import directory to a TabletServer. Currently, this is done randomly.
> The fix for this issue will allow a property to be set that will constrain 
> the set of TabletServers that the Master will choose from.
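
As a hedged sketch of the idea (the property handling, types, and fallback 
behavior here are illustrative, not the actual Accumulo implementation), the 
Master's random choice could be constrained to servers matching a configured 
regex:

{code}
import java.util.List;
import java.util.Random;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class BulkImportServerChooser {
  private static final Random random = new Random();

  // Pick a tablet server for one bulk import file, constrained by a regex.
  static String chooseServer(List<String> liveServers, String serverRegex) {
    Pattern p = Pattern.compile(serverRegex);
    List<String> candidates = liveServers.stream()
        .filter(s -> p.matcher(s).matches())
        .collect(Collectors.toList());
    // Fall back to all servers if nothing matches, then pick randomly,
    // which preserves the current behavior when no constraint is set.
    if (candidates.isEmpty()) {
      candidates = liveServers;
    }
    return candidates.get(random.nextInt(candidates.size()));
  }
}
{code}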



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (ACCUMULO-4664) Allow user to specify which tablet servers will be used to process bulk import files

2017-06-22 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion updated ACCUMULO-4664:
--
Fix Version/s: 2.0.0

> Allow user to specify which tablet servers will be used to process bulk 
> import files
> 
>
> Key: ACCUMULO-4664
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4664
> Project: Accumulo
>  Issue Type: Improvement
>  Components: master
>Reporter: Dave Marion
>Assignee: Dave Marion
> Fix For: 2.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When a bulk import occurs, the Master assigns the import of a single file in 
> the bulk import directory to a TabletServer. Currently, this is done randomly.
> The fix for this issue will allow a property to be set that will constrain 
> the set of TabletServers that the Master will choose from.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (ACCUMULO-4664) Allow user to specify which tablet servers will be used to process bulk import files

2017-06-21 Thread Dave Marion (JIRA)
Dave Marion created ACCUMULO-4664:
-

 Summary: Allow user to specify which tablet servers will be used 
to process bulk import files
 Key: ACCUMULO-4664
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4664
 Project: Accumulo
  Issue Type: Improvement
  Components: master
Reporter: Dave Marion
Assignee: Dave Marion


When a bulk import occurs, the Master assigns the import of a single file in 
the bulk import directory to a TabletServer. Currently, this is done randomly.

The fix for this issue will allow a property to be set that will constrain the 
set of TabletServers that the Master will choose from.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (ACCUMULO-4658) Make HostRegexTableLoadBalancer less chatty with ZooKeeper

2017-06-19 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion resolved ACCUMULO-4658.
---
   Resolution: Fixed
Fix Version/s: 2.0.0
   1.8.2
   1.7.4

> Make HostRegexTableLoadBalancer less chatty with ZooKeeper
> --
>
> Key: ACCUMULO-4658
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4658
> Project: Accumulo
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 1.7.3, 1.8.1
>Reporter: Dave Marion
>Assignee: Dave Marion
>Priority: Minor
> Fix For: 1.7.4, 1.8.2, 2.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (ACCUMULO-4658) Make HostRegexTableLoadBalancer less chatty with ZooKeeper

2017-06-19 Thread Dave Marion (JIRA)
Dave Marion created ACCUMULO-4658:
-

 Summary: Make HostRegexTableLoadBalancer less chatty with ZooKeeper
 Key: ACCUMULO-4658
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4658
 Project: Accumulo
  Issue Type: Improvement
  Components: master
Affects Versions: 1.8.1, 1.7.3
Reporter: Dave Marion
Assignee: Dave Marion
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (ACCUMULO-4647) Create contributors guide

2017-06-13 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion resolved ACCUMULO-4647.
---
Resolution: Fixed

> Create contributors guide
> -
>
> Key: ACCUMULO-4647
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4647
> Project: Accumulo
>  Issue Type: Task
>Reporter: Dave Marion
>Assignee: Dave Marion
>Priority: Minor
> Fix For: 2.0.0
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (ACCUMULO-4647) Create contributors guide

2017-06-06 Thread Dave Marion (JIRA)
Dave Marion created ACCUMULO-4647:
-

 Summary: Create contributors guide
 Key: ACCUMULO-4647
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4647
 Project: Accumulo
  Issue Type: Task
Reporter: Dave Marion
Assignee: Dave Marion
Priority: Minor
 Fix For: 2.0.0






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (ACCUMULO-4463) Make caching implementation configurable

2017-05-22 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion resolved ACCUMULO-4463.
---
Resolution: Fixed

> Make caching implementation configurable
> 
>
> Key: ACCUMULO-4463
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4463
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Assignee: Dave Marion
> Fix For: 2.0.0
>
>  Time Spent: 14.5h
>  Remaining Estimate: 0h
>
> It would be nice to make the caching implementation configurable. 
> ACCUMULO-4177 introduced a new cache type.  Instead of Accumulo having a list 
> of built-in cache implementations, it could have a configuration property for 
> specifying a block cache factory class.  Accumulo could ship with multiple 
> implementations of this as it does after 4177 and allow users to easily 
> experiment with implementations that Accumulo did not ship with.
> It would be nice to have ACCUMULO-3384, so that these custom cache impls 
> could use that for custom config.
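
A minimal sketch of the factory-class property pattern being proposed; the 
property name, interface, and default class below are illustrative stand-ins, 
not the final Accumulo API:

{code}
public interface BlockCacheFactory {
  Object createCache(long maxSize); // stand-in for the real block cache type
}

class BlockCacheLoader {
  // Hypothetical property naming the factory implementation to load.
  static final String CACHE_FACTORY_PROPERTY = "tserver.cache.factory.class";

  static BlockCacheFactory load(java.util.Map<String,String> config)
      throws ReflectiveOperationException {
    // Fall back to a built-in implementation when the property is unset.
    String className = config.getOrDefault(CACHE_FACTORY_PROPERTY,
        "org.example.LruBlockCacheFactory"); // illustrative default
    Class<? extends BlockCacheFactory> clazz =
        Class.forName(className).asSubclass(BlockCacheFactory.class);
    return clazz.getDeclaredConstructor().newInstance();
  }
}
{code}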



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (ACCUMULO-4463) Make caching implementation configurable

2017-05-08 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion reassigned ACCUMULO-4463:
-

Assignee: Dave Marion

> Make caching implementation configurable
> 
>
> Key: ACCUMULO-4463
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4463
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Keith Turner
>Assignee: Dave Marion
> Fix For: 2.0.0
>
>
> It would be nice to make the caching implementation configurable. 
> ACCUMULO-4177 introduced a new cache type.  Instead of Accumulo having a list 
> of built-in cache implementations, it could have a configuration property for 
> specifying a block cache factory class.  Accumulo could ship with multiple 
> implementations of this as it does after 4177 and allow users to easily 
> experiment with implementations that Accumulo did not ship with.
> It would be nice to have ACCUMULO-3384, so that these custom cache impls 
> could use that for custom config.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ACCUMULO-4626) improve cache hit rate via weak reference map

2017-05-04 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15997193#comment-15997193
 ] 

Dave Marion commented on ACCUMULO-4626:
---

Agreed. I was merely stating that Ignite has a distributed cache, but you can 
run it in local mode (non-distributed). It appears that you can access it via 
the JCache API and its own native API, much like JBoss Infinispan. I was not 
aware of ohc; thanks for the link.

In either case, I'm wondering if we should be looking at off-heap caching in 
general.

> improve cache hit rate via weak reference map
> -
>
> Key: ACCUMULO-4626
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4626
> Project: Accumulo
>  Issue Type: Improvement
>  Components: tserver
>Reporter: Adam Fuchs
>  Labels: performance, stability
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When a single iterator tree references the same RFile blocks in different 
> branches we sometimes get cache misses for one iterator even though the 
> requested block is held in memory by another iterator. This is particularly 
> important when using something like the IntersectingIterator to intersect 
> many deep copies. Instead of evicting completely, keeping evicted blocks in 
> a WeakReference value map can avoid re-reading blocks that are currently 
> referenced by another deep copied source iterator.
> We've seen this in the field for some of Sqrrl's queries against very large 
> tablets. The total memory usage for these queries can be equal to the size of 
> all the iterator block reads times the number of readahead threads times the 
> number of files times the number of IntersectingIterator children when cache 
> miss rates are high. This might work out to something like:
> {code}
> 16 readahead threads * 200 deep copied children * 99% cache miss rate * 20 
> files * 252KB per reader = ~16GB of memory
> {code}
> In most cases, evicting to a weak reference value map changes the cache miss 
> rate from very high to very low and has a dramatic effect on total memory 
> usage.
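
A minimal sketch of the eviction scheme described above, assuming a simple 
name-to-block map rather than Accumulo's actual BlockCache interface: evicted 
blocks move to a weak-reference map, so a block still strongly referenced by 
another deep-copied iterator can be promoted back instead of re-read:

{code}
import java.lang.ref.WeakReference;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WeakEvictionCache {
  // Access-ordered strong cache; the weak map holds evicted entries.
  private final Map<String,byte[]> strong = new LinkedHashMap<>(16, 0.75f, true);
  private final Map<String,WeakReference<byte[]>> evicted = new ConcurrentHashMap<>();

  synchronized byte[] get(String blockName) {
    byte[] block = strong.get(blockName);
    if (block != null) {
      return block;
    }
    // Miss in the strong map: the block may still be alive because some
    // other deep-copied source iterator holds a strong reference to it.
    WeakReference<byte[]> ref = evicted.remove(blockName);
    if (ref != null) {
      block = ref.get();
      if (block != null) {
        strong.put(blockName, block); // promote back instead of re-reading
      }
    }
    return block;
  }

  synchronized void evict(String blockName) {
    byte[] block = strong.remove(blockName);
    if (block != null) {
      // Keep a weak handle; the GC reclaims it only once no iterator uses it.
      evicted.put(blockName, new WeakReference<>(block));
    }
  }
}
{code}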



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ACCUMULO-4626) improve cache hit rate via weak reference map

2017-05-04 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15997128#comment-15997128
 ] 

Dave Marion commented on ACCUMULO-4626:
---

Curious if anyone has looked at / thought of replacing the block cache in the 
TabletServer with Apache Ignite. Ignite has a distributed cache that you can 
run in local mode, which stores data off-heap (on-heap is configurable), has 
eviction, etc. 

https://apacheignite.readme.io/docs/data-grid


> improve cache hit rate via weak reference map
> -
>
> Key: ACCUMULO-4626
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4626
> Project: Accumulo
>  Issue Type: Improvement
>  Components: tserver
>Reporter: Adam Fuchs
>  Labels: performance, stability
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When a single iterator tree references the same RFile blocks in different 
> branches we sometimes get cache misses for one iterator even though the 
> requested block is held in memory by another iterator. This is particularly 
> important when using something like the IntersectingIterator to intersect 
> many deep copies. Instead of evicting completely, keeping evicted blocks in 
> a WeakReference value map can avoid re-reading blocks that are currently 
> referenced by another deep copied source iterator.
> We've seen this in the field for some of Sqrrl's queries against very large 
> tablets. The total memory usage for these queries can be equal to the size of 
> all the iterator block reads times the number of readahead threads times the 
> number of files times the number of IntersectingIterator children when cache 
> miss rates are high. This might work out to something like:
> {code}
> 16 readahead threads * 200 deep copied children * 99% cache miss rate * 20 
> files * 252KB per reader = ~16GB of memory
> {code}
> In most cases, evicting to a weak reference value map changes the cache miss 
> rate from very high to very low and has a dramatic effect on total memory 
> usage.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ACCUMULO-3521) Unused Iterator-related classes

2017-05-02 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993636#comment-15993636
 ] 

Dave Marion commented on ACCUMULO-3521:
---

For the items removed in 2.0, are they marked deprecated in 1.8.x?

> Unused Iterator-related classes
> ---
>
> Key: ACCUMULO-3521
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3521
> Project: Accumulo
>  Issue Type: Sub-task
>Reporter: Christopher Tubbs
>Assignee: Michael Miller
> Fix For: 2.0.0
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> The following iterators and iterator-related utilities have no references at 
> all in our code; minimally, they should have unit tests (this list does not 
> include deprecated iterators):
> # OrIterator
> # FirstEntryInRowIterator (in particular, the setNumScansBeforeSeek method is 
> never called)
> # TypedValueCombiner (in particular, the setLossyness method is never called)
> # IteratorUtil.getMaxPriority and .findIterator methods
> # StatsCombiner (available in the examples) does not use the setRadix method
> Note: these iterators may not be considered "public API", but it might still 
> be unsafe to remove the unused portions if somebody has used them.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ACCUMULO-4629) Seeking in timestamp range is slow

2017-04-25 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983664#comment-15983664
 ] 

Dave Marion commented on ACCUMULO-4629:
---

In the example above, do all keys have the same row, colf, and colq? You said 
they were omitted. If they are the same, is the versioning iterator removed from 
the table? If so, shouldn't the delete iterator just skip over delete keys?

Sorry so short, on mobile...

> Seeking in timestamp range is slow
> --
>
> Key: ACCUMULO-4629
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4629
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Keith Turner
> Fix For: 2.0.0
>
>
> Fluo's internal schema uses the first 4 bits of the timestamp to store 
> different types of information per column.  These first 4 bits divide the 
> timestamp space up into 16 ranges.  Fluo has server side iterators that 
> consider information in one of these 16 ranges and then seek forward to 
> another of the 16 ranges.
> Unfortunately, Accumulo's built-in iterator that processes delete markers 
> makes seeking within the timestamps of a column an O(N^2) operation.  This is 
> because of the way deletes work in Accumulo.  A delete marker for timestamp X 
> in Accumulo deletes anything with a timestamp <= X.  
> When seeking to timestamp Y, the Accumulo iterator that handles deletes will 
> scan from MAX_LONG to Y looking for any deletes that may keep you from seeing 
> data at timestamp Y.  The following example shows what the delete iterator 
> will do when a user iterator does some seeks.
>  * User iterator seeks to stamp 1,000,000.  This causes the delete iter to 
> scan from MAX_LONG to 1,000,000 looking for delete markers.
>  * User iterator seeks to stamp 900,000.  This causes the delete iter to scan 
> from MAX_LONG to 900,000 looking for delete markers.
>  * User iterator seeks to stamp 500,000.  This causes the delete iter to scan 
> from MAX_LONG to 500,000 looking for delete markers.
> So that Fluo can seek efficiently, it has done some [serious 
> shenanigans|https://github.com/apache/incubator-fluo/blob/rel/fluo-1.0.0-incubating/modules/accumulo/src/main/java/org/apache/fluo/accumulo/iterators/TimestampSkippingIterator.java#L164]
>  using reflection to remove the DeleteIterator.  The great work being done 
> on ACCUMULO-3079 will likely break this crazy reflection code.  So I would 
> like to make a change to upstream Accumulo that allows efficient seeking in 
> the timestamp range.  I have thought of the following two possible solutions.
>  * Make the DeleteIterator stateful so that it remembers what ranges it has 
> scanned for deletes.  I don't really like this solution because it will add 
> expense to every seek in Accumulo for an edge case.
>  * Make it possible to create tables with an exact delete behavior, meaning a 
> delete for timestamp X will only delete an existing row column with that 
> exact timestamp.  This option could only be chosen at table creation time and 
> could not be changed.  For this delete behavior, the delete iterator does not 
> need to scan for every seek.
> Are there other possible solutions?
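
To make the access pattern concrete, here is a hedged sketch of a server-side 
iterator seeking within the timestamps of one column; the names are 
illustrative and this is not Fluo's actual iterator. Each call below causes 
the built-in delete-handling iterator to rescan from MAX_LONG down to the 
target timestamp:

{code}
import java.io.IOException;
import java.util.Collections;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.iterators.SortedKeyValueIterator;
import org.apache.hadoop.io.Text;

public class TimestampSeekExample {
  // Seek the source so the next key returned for (row, cf, cq) has a
  // timestamp <= stamp. Keys with newer timestamps sort before the start key.
  static void seekToStamp(SortedKeyValueIterator<Key,Value> source,
      Text row, Text cf, Text cq, long stamp) throws IOException {
    Key start = new Key(row, cf, cq, stamp);
    Key end = new Key(row, cf, cq, 0L);
    // The delete-handling iterator below this one still walks from MAX_LONG
    // down to `stamp` looking for delete markers before returning the first
    // key, which is what makes repeated seeks O(N^2).
    source.seek(new Range(start, true, end, true), Collections.emptySet(), false);
  }
}
{code}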



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ACCUMULO-4629) Seeking in timestamp range is slow

2017-04-25 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983556#comment-15983556
 ] 

Dave Marion commented on ACCUMULO-4629:
---

Maybe my option was poorly named. Can we supply a flag at scan time such that 
the DeleteIterator does not:

bq. User iterator seeks to stamp 1,000,000. This causes the delete iter to 
scan from MAX_LONG to 1,000,000 looking for delete markers.
bq. User iterator seeks to stamp 900,000. This causes the delete iter to 
scan from MAX_LONG to 900,000 looking for delete markers.
bq. User iterator seeks to stamp 500,000. This causes the delete iter to 
scan from MAX_LONG to 500,000 looking for delete markers.

But instead just seeks to the first non-delete key and does not scan for delete 
markers? I don't think this would propagate the delete keys up the iterator 
stack; it would skip over them. Maybe I'm misunderstanding something.

> Seeking in timestamp range is slow
> --
>
> Key: ACCUMULO-4629
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4629
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Keith Turner
> Fix For: 2.0.0
>
>
> Fluo's internal schema uses the first 4 bits of the timestamp to store 
> different types of information per column.  These first 4 bits divide the 
> timestamp space up into 16 ranges.  Fluo has server side iterators that 
> consider information in one of these 16 ranges and then seek forward to 
> another of the 16 ranges.
> Unfortunately, Accumulo's built-in iterator that processes delete markers 
> makes seeking within the timestamps of a column an O(N^2) operation.  This is 
> because of the way deletes work in Accumulo.  A delete marker for timestamp X 
> in Accumulo deletes anything with a timestamp <= X.  
> When seeking to timestamp Y, the Accumulo iterator that handles deletes will 
> scan from MAX_LONG to Y looking for any deletes that may keep you from seeing 
> data at timestamp Y.  The following example shows what the delete iterator 
> will do when a user iterator does some seeks.
>  * User iterator seeks to stamp 1,000,000.  This causes the delete iter to 
> scan from MAX_LONG to 1,000,000 looking for delete markers.
>  * User iterator seeks to stamp 900,000.  This causes the delete iter to scan 
> from MAX_LONG to 900,000 looking for delete markers.
>  * User iterator seeks to stamp 500,000.  This causes the delete iter to scan 
> from MAX_LONG to 500,000 looking for delete markers.
> So that Fluo can seek efficiently, it has done some [serious 
> shenanigans|https://github.com/apache/incubator-fluo/blob/rel/fluo-1.0.0-incubating/modules/accumulo/src/main/java/org/apache/fluo/accumulo/iterators/TimestampSkippingIterator.java#L164]
>  using reflection to remove the DeleteIterator.  The great work being done 
> on ACCUMULO-3079 will likely break this crazy reflection code.  So I would 
> like to make a change to upstream Accumulo that allows efficient seeking in 
> the timestamp range.  I have thought of the following two possible solutions.
>  * Make the DeleteIterator stateful so that it remembers what ranges it has 
> scanned for deletes.  I don't really like this solution because it will add 
> expense to every seek in Accumulo for an edge case.
>  * Make it possible to create tables with an exact delete behavior, meaning a 
> delete for timestamp X will only delete an existing row column with that 
> exact timestamp.  This option could only be chosen at table creation time and 
> could not be changed.  For this delete behavior, the delete iterator does not 
> need to scan for every seek.
> Are there other possible solutions?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ACCUMULO-4629) Seeking in timestamp range is slow

2017-04-25 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983474#comment-15983474
 ] 

Dave Marion commented on ACCUMULO-4629:
---

bq. Scanner.disableDeleteIterator();

I like this better than the two options that you suggested in the description. 
I have not looked to see if it's possible to do.

> Seeking in timestamp range is slow
> --
>
> Key: ACCUMULO-4629
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4629
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Keith Turner
> Fix For: 2.0.0
>
>
> Fluo's internal schema uses the first 4 bits of the timestamp to store 
> different types of information per column.  These first 4 bits divide the 
> timestamp space up into 16 ranges.  Fluo has server side iterators that 
> consider information in one of these 16 ranges and then seek forward to 
> another of the 16 ranges.
> Unfortunately, Accumulo's built-in iterator that processes delete markers 
> makes seeking within the timestamps of a column an O(N^2) operation.  This is 
> because of the way deletes work in Accumulo.  A delete marker for timestamp X 
> in Accumulo deletes anything with a timestamp <= X.  
> When seeking to timestamp Y, the Accumulo iterator that handles deletes will 
> scan from MAX_LONG to Y looking for any deletes that may keep you from seeing 
> data at timestamp Y.  The following example shows what the delete iterator 
> will do when a user iterator does some seeks.
>  * User iterator seeks to stamp 1,000,000.  This causes the delete iter to 
> scan from MAX_LONG to 1,000,000 looking for delete markers.
>  * User iterator seeks to stamp 900,000.  This causes the delete iter to scan 
> from MAX_LONG to 900,000 looking for delete markers.
>  * User iterator seeks to stamp 500,000.  This causes the delete iter to scan 
> from MAX_LONG to 500,000 looking for delete markers.
> So that Fluo can seek efficiently, it has done some [serious 
> shenanigans|https://github.com/apache/incubator-fluo/blob/rel/fluo-1.0.0-incubating/modules/accumulo/src/main/java/org/apache/fluo/accumulo/iterators/TimestampSkippingIterator.java#L164]
>  using reflection to remove the DeleteIterator.  The great work being done 
> on ACCUMULO-3079 will likely break this crazy reflection code.  So I would 
> like to make a change to upstream Accumulo that allows efficient seeking in 
> the timestamp range.  I have thought of the following two possible solutions.
>  * Make the DeleteIterator stateful so that it remembers what ranges it has 
> scanned for deletes.  I don't really like this solution because it will add 
> expense to every seek in Accumulo for an edge case.
>  * Make it possible to create tables with an exact delete behavior, meaning a 
> delete for timestamp X will only delete an existing row column with that 
> exact timestamp.  This option could only be chosen at table creation time and 
> could not be changed.  For this delete behavior, the delete iterator does not 
> need to scan for every seek.
> Are there other possible solutions?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ACCUMULO-4629) Seeking in timestamp range is slow

2017-04-25 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15983355#comment-15983355
 ] 

Dave Marion commented on ACCUMULO-4629:
---

Can we somehow pass a scan option to disable delete marker checking and seek to 
the first non-delete key?

> Seeking in timestamp range is slow
> --
>
> Key: ACCUMULO-4629
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4629
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Keith Turner
> Fix For: 2.0.0
>
>
> Fluo's internal schema uses the first 4 bits of the timestamp to store 
> different types of information per column.  These first 4 bits divide the 
> timestamp space up into 16 ranges.  Fluo has server side iterators that 
> consider information in one of these 16 ranges and then seek forward to 
> another of the 16 ranges.
> Unfortunately, Accumulo's built-in iterator that processes delete markers 
> makes seeking within the timestamps of a column an O(N^2) operation.  This is 
> because of the way deletes work in Accumulo.  A delete marker for timestamp X 
> in Accumulo deletes anything with a timestamp <= X.  
> When seeking to timestamp Y, the Accumulo iterator that handles deletes will 
> scan from MAX_LONG to Y looking for any deletes that may keep you from seeing 
> data at timestamp Y.  The following example shows what the delete iterator 
> will do when a user iterator does some seeks.
>  * User iterator seeks to stamp 1,000,000.  This causes the delete iter to 
> scan from MAX_LONG to 1,000,000 looking for delete markers.
>  * User iterator seeks to stamp 900,000.  This causes the delete iter to scan 
> from MAX_LONG to 900,000 looking for delete markers.
>  * User iterator seeks to stamp 500,000.  This causes the delete iter to scan 
> from MAX_LONG to 500,000 looking for delete markers.
> So that Fluo can seek efficiently, it has done some [serious 
> shenanigans|https://github.com/apache/incubator-fluo/blob/rel/fluo-1.0.0-incubating/modules/accumulo/src/main/java/org/apache/fluo/accumulo/iterators/TimestampSkippingIterator.java#L164]
>  using reflection to remove the DeleteIterator.  The great work being done 
> on ACCUMULO-3079 will likely break this crazy reflection code.  So I would 
> like to make a change to upstream Accumulo that allows efficient seeking in 
> the timestamp range.  I have thought of the following two possible solutions.
>  * Make the DeleteIterator stateful so that it remembers what ranges it has 
> scanned for deletes.  I don't really like this solution because it will add 
> expense to every seek in Accumulo for an edge case.
>  * Make it possible to create tables with an exact delete behavior, meaning a 
> delete for timestamp X will only delete an existing row column with that 
> exact timestamp.  This option could only be chosen at table creation time and 
> could not be changed.  For this delete behavior, the delete iterator does not 
> need to scan for every seek.
> Are there other possible solutions?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ACCUMULO-4615) ThreadPool timeout when checking tserver stats is confusing

2017-03-28 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15945815#comment-15945815
 ] 

Dave Marion commented on ACCUMULO-4615:
---

[~mjwall] You got most of it, basically the monitoring was freaking out, 
showing different values for # tservers, # tablets, # offline tables. Default 
timeout is 2 * client timeout (default 3s). Not sure if increasing the client 
timeout will cause other issues, or if we should be using a different property.

> ThreadPool timeout when checking tserver stats is confusing
> ---
>
> Key: ACCUMULO-4615
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4615
> Project: Accumulo
>  Issue Type: Bug
>  Components: master
>Affects Versions: 1.8.1
>Reporter: Michael Wall
>Priority: Minor
>
> If it takes longer than the configured time to gather information from all 
> the tablet servers, the thread pool stops and processing continues with 
> whatever has been collected.  Code is 
> https://github.com/apache/accumulo/blob/1.8/server/master/src/main/java/org/apache/accumulo/master/Master.java#L1120,
>  default timeout is 6s.  Does not appear to be an issue prior to 1.8.
> Best case, this was really confusing.  The monitor page would have 30 
> tservers, then 5 tservers.  Didn't really see any other negative effects, no 
> migrations and no balancing appeared to be affected.  Worst case, though, I 
> missed something and the master is making decisions based on incomplete 
> information.
> [~dlmar...@comcast.net] please add more info if needed.
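
For context, a hedged sketch of the gather-with-timeout pattern being 
described (all names are illustrative, not the actual Master code): the pool 
is given a fixed window, and whatever has not completed is simply missing 
from the report, which is why the monitor can show 30 tservers one refresh 
and 5 the next:

{code}
import java.util.List;
import java.util.Map;
import java.util.concurrent.*;

public class StatusGatherSketch {
  static Map<String,String> gather(List<String> servers, long timeoutSecs)
      throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(8);
    Map<String,String> results = new ConcurrentHashMap<>();
    for (String server : servers) {
      pool.submit(() -> results.put(server, fetchStatus(server)));
    }
    pool.shutdown();
    // If not every fetch finishes in time, we fall through with partial
    // results instead of failing, so downstream consumers silently see a
    // subset of the tablet servers.
    if (!pool.awaitTermination(timeoutSecs, TimeUnit.SECONDS)) {
      pool.shutdownNow();
    }
    return results;
  }

  private static String fetchStatus(String server) {
    return "ok"; // placeholder for the RPC to one tablet server
  }
}
{code}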



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ACCUMULO-4597) NPE from RFile PrintInfo

2017-03-04 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15895882#comment-15895882
 ] 

Dave Marion commented on ACCUMULO-4597:
---

We restarted CI yesterday. I checked an A and a C type file; both exhibit the 
same symptoms when `-d` is specified. Without `-d`, it works without error.

> NPE from RFile PrintInfo
> 
>
> Key: ACCUMULO-4597
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4597
> Project: Accumulo
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 1.8.1
>Reporter: Dave Marion
>Assignee: Keith Turner
>
> Ran continuous ingest on a 4 node cluster. Tried running rfile-info on a 
> resulting RFile.
> {noformat}
> /opt/accumulo-1.8.1/bin/accumulo rfile-info -d 
> hdfs://localhost:8020/accumulo/tables/2/t-0bq/C0005ct2.rf
> Reading file: hdfs://localhost:8020/accumulo/tables/2/t-0bq/C0005ct2.rf
> RFile Version: 8
> Locality group   : 
> Num   blocks   : 2,868
> Index level 0  : 92,114 bytes  1 blocks
> First key  : 167494e6a12f 3156:10ea [] 1488561068054 false
> Last key   : 176d22e8074a 711e:4443 [] 1488560945783 false
> Num entries: 2,672,521
> Column families: 
> Meta block : BCFile.index
>   Raw size : 4 bytes
>   Compressed size  : 12 bytes
>   Compression type : gz
> Meta block : RFile.index
>   Raw size : 92,190 bytes
>   Compressed size  : 44,822 bytes
>   Compression type : gz
> 2017-03-03 11:43:10,451 [start.Main] ERROR: Thread 'rfile-info' died.
> java.lang.NullPointerException
> at 
> org.apache.accumulo.core.file.rfile.RFile$Reader.getLocalityGroupCF(RFile.java:1300)
> at 
> org.apache.accumulo.core.file.rfile.PrintInfo.execute(PrintInfo.java:165)
> at org.apache.accumulo.start.Main$1.run(Main.java:120)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (ACCUMULO-4597) NPE from RFile PrintInfo

2017-03-03 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion updated ACCUMULO-4597:
--
Description: 
Ran continuous ingest on a 4 node cluster. Tried running rfile-info on a 
resulting RFile.

{noformat}
/opt/accumulo-1.8.1/bin/accumulo rfile-info -d 
hdfs://localhost:8020/accumulo/tables/2/t-0bq/C0005ct2.rf
Reading file: hdfs://localhost:8020/accumulo/tables/2/t-0bq/C0005ct2.rf
RFile Version: 8

Locality group   : 
Num   blocks   : 2,868
Index level 0  : 92,114 bytes  1 blocks
First key  : 167494e6a12f 3156:10ea [] 1488561068054 false
Last key   : 176d22e8074a 711e:4443 [] 1488560945783 false
Num entries: 2,672,521
Column families: 

Meta block : BCFile.index
  Raw size : 4 bytes
  Compressed size  : 12 bytes
  Compression type : gz

Meta block : RFile.index
  Raw size : 92,190 bytes
  Compressed size  : 44,822 bytes
  Compression type : gz

2017-03-03 11:43:10,451 [start.Main] ERROR: Thread 'rfile-info' died.
java.lang.NullPointerException
at 
org.apache.accumulo.core.file.rfile.RFile$Reader.getLocalityGroupCF(RFile.java:1300)
at org.apache.accumulo.core.file.rfile.PrintInfo.execute(PrintInfo.java:165)
at org.apache.accumulo.start.Main$1.run(Main.java:120)
at java.lang.Thread.run(Thread.java:745)

{noformat}

  was:
Ran continuous ingest on a 4 node cluster. Tried running rfile-info on a 
resulting RFile.

```
/opt/accumulo-1.8.1/bin/accumulo rfile-info -d 
hdfs://localhost:8020/accumulo/tables/2/t-0bq/C0005ct2.rf
Reading file: hdfs://localhost:8020/accumulo/tables/2/t-0bq/C0005ct2.rf
RFile Version: 8

Locality group   : 
Num   blocks   : 2,868
Index level 0  : 92,114 bytes  1 blocks
First key  : 167494e6a12f 3156:10ea [] 1488561068054 false
Last key   : 176d22e8074a 711e:4443 [] 1488560945783 false
Num entries: 2,672,521
Column families: 

Meta block : BCFile.index
  Raw size : 4 bytes
  Compressed size  : 12 bytes
  Compression type : gz

Meta block : RFile.index
  Raw size : 92,190 bytes
  Compressed size  : 44,822 bytes
  Compression type : gz

2017-03-03 11:43:10,451 [start.Main] ERROR: Thread 'rfile-info' died.
java.lang.NullPointerException
at 
org.apache.accumulo.core.file.rfile.RFile$Reader.getLocalityGroupCF(RFile.java:1300)
at org.apache.accumulo.core.file.rfile.PrintInfo.execute(PrintInfo.java:165)
at org.apache.accumulo.start.Main$1.run(Main.java:120)
at java.lang.Thread.run(Thread.java:745)

```


> NPE from RFile PrintInfo
> 
>
> Key: ACCUMULO-4597
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4597
> Project: Accumulo
>  Issue Type: Bug
>  Components: scripts
>Affects Versions: 1.8.1
>Reporter: Dave Marion
>
> Ran continuous ingest on a 4 node cluster. Tried running rfile-info on a 
> resulting RFile.
> {noformat}
> /opt/accumulo-1.8.1/bin/accumulo rfile-info -d 
> hdfs://localhost:8020/accumulo/tables/2/t-0bq/C0005ct2.rf
> Reading file: hdfs://localhost:8020/accumulo/tables/2/t-0bq/C0005ct2.rf
> RFile Version: 8
> Locality group   : 
> Num   blocks   : 2,868
> Index level 0  : 92,114 bytes  1 blocks
> First key  : 167494e6a12f 3156:10ea [] 1488561068054 false
> Last key   : 176d22e8074a 711e:4443 [] 1488560945783 false
> Num entries: 2,672,521
> Column families: 
> Meta block : BCFile.index
>   Raw size : 4 bytes
>   Compressed size  : 12 bytes
>   Compression type : gz
> Meta block : RFile.index
>   Raw size : 92,190 bytes
>   Compressed size  : 44,822 bytes
>   Compression type : gz
> 2017-03-03 11:43:10,451 [start.Main] ERROR: Thread 'rfile-info' died.
> java.lang.NullPointerException
> at 
> org.apache.accumulo.core.file.rfile.RFile$Reader.getLocalityGroupCF(RFile.java:1300)
> at 
> org.apache.accumulo.core.file.rfile.PrintInfo.execute(PrintInfo.java:165)
> at org.apache.accumulo.start.Main$1.run(Main.java:120)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (ACCUMULO-4597) NPE from RFile PrintInfo

2017-03-03 Thread Dave Marion (JIRA)
Dave Marion created ACCUMULO-4597:
-

 Summary: NPE from RFile PrintInfo
 Key: ACCUMULO-4597
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4597
 Project: Accumulo
  Issue Type: Bug
  Components: scripts
Affects Versions: 1.8.1
Reporter: Dave Marion


Ran continuous ingest on a 4 node cluster. Tried running rfile-info on a 
resulting RFile.

```
/opt/accumulo-1.8.1/bin/accumulo rfile-info -d 
hdfs://localhost:8020/accumulo/tables/2/t-0bq/C0005ct2.rf
Reading file: hdfs://localhost:8020/accumulo/tables/2/t-0bq/C0005ct2.rf
RFile Version: 8

Locality group   : 
Num   blocks   : 2,868
Index level 0  : 92,114 bytes  1 blocks
First key  : 167494e6a12f 3156:10ea [] 1488561068054 false
Last key   : 176d22e8074a 711e:4443 [] 1488560945783 false
Num entries: 2,672,521
Column families: 

Meta block : BCFile.index
  Raw size : 4 bytes
  Compressed size  : 12 bytes
  Compression type : gz

Meta block : RFile.index
  Raw size : 92,190 bytes
  Compressed size  : 44,822 bytes
  Compression type : gz

2017-03-03 11:43:10,451 [start.Main] ERROR: Thread 'rfile-info' died.
java.lang.NullPointerException
at 
org.apache.accumulo.core.file.rfile.RFile$Reader.getLocalityGroupCF(RFile.java:1300)
at org.apache.accumulo.core.file.rfile.PrintInfo.execute(PrintInfo.java:165)
at org.apache.accumulo.start.Main$1.run(Main.java:120)
at java.lang.Thread.run(Thread.java:745)

```



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (ACCUMULO-4576) HostRegexTableLoadBalancer assigns/balances with stale information

2017-01-25 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion resolved ACCUMULO-4576.
---
Resolution: Fixed

> HostRegexTableLoadBalancer assigns/balances with stale information
> --
>
> Key: ACCUMULO-4576
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4576
> Project: Accumulo
>  Issue Type: Bug
>  Components: master
>Affects Versions: 1.7.2, 1.8.0
>Reporter: Dave Marion
>Assignee: Dave Marion
> Fix For: 1.7.3, 1.8.1, 2.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The balancer maintains an internal mapping of tablet server pools and tablet 
> server status. It's updated at a configurable interval, which was done 
> initially as an optimization. Unfortunately it has the negative side effect 
> of providing the assignment and balance operations with stale information 
> about the tablet servers, which leads to a constant shuffling of tablets. The 
> configuration property and optimization need to be removed to supply the 
> assign/balance methods with the correct information every time.
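
A minimal sketch of the staleness pattern being removed, with illustrative 
names rather than the actual HostRegexTableLoadBalancer fields: the cached 
variant returns a snapshot up to one interval old, while the fixed variant 
derives the pool mapping from the live server list on every call:

{code}
import java.util.SortedMap;
import java.util.TreeMap;

class PoolViewSketch {
  private SortedMap<String,String> cached = new TreeMap<>(); // server -> pool
  private long lastUpdate = 0L;
  private final long intervalMillis = 60_000L;

  // Before: refresh only when the interval has elapsed, so assign/balance
  // can act on servers that have died or joined since the last update.
  SortedMap<String,String> getPoolsCached(SortedMap<String,String> current) {
    if (System.currentTimeMillis() - lastUpdate > intervalMillis) {
      cached = new TreeMap<>(current);
      lastUpdate = System.currentTimeMillis();
    }
    return cached;
  }

  // After: always rebuild from the server list passed into assign/balance,
  // so decisions never use a stale snapshot.
  SortedMap<String,String> getPools(SortedMap<String,String> current) {
    return new TreeMap<>(current);
  }
}
{code}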



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (ACCUMULO-4576) HostRegexTableLoadBalancer assigns/balances with stale information

2017-01-25 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion updated ACCUMULO-4576:
--
Fix Version/s: 2.0.0
   1.8.1
   1.7.3

> HostRegexTableLoadBalancer assigns/balances with stale information
> --
>
> Key: ACCUMULO-4576
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4576
> Project: Accumulo
>  Issue Type: Bug
>  Components: master
>Affects Versions: 1.7.2, 1.8.0
>Reporter: Dave Marion
>Assignee: Dave Marion
> Fix For: 1.7.3, 1.8.1, 2.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The balancer maintains an internal mapping of tablet server pools and tablet 
> server status. It's updated at a configurable interval, which was done 
> initially as an optimization. Unfortunately it has the negative side effect 
> of providing the assignment and balance operations with stale information 
> about the tablet servers, which leads to a constant shuffling of tablets. The 
> configuration property and optimization need to be removed to supply the 
> assign/balance methods with the correct information every time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (ACCUMULO-4576) HostRegexTableLoadBalancer assigns/balances with stale information

2017-01-24 Thread Dave Marion (JIRA)
Dave Marion created ACCUMULO-4576:
-

 Summary: HostRegexTableLoadBalancer assigns/balances with stale 
information
 Key: ACCUMULO-4576
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4576
 Project: Accumulo
  Issue Type: Bug
  Components: master
Affects Versions: 1.8.0, 1.7.2
Reporter: Dave Marion
Assignee: Dave Marion


The balancer maintains an internal mapping of tablet server pools and tablet 
server status. It's updated at a configurable interval, which was done 
initially as an optimization. Unfortunately it has the negative side effect of 
providing the assignment and balance operations with stale information about 
the tablet servers, which leads to a constant shuffling of tablets. The 
configuration property and optimization need to be removed to supply the 
assign/balance methods with the correct information every time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4535) HostRegexTableLoadBalancer fails with NullPointerException

2016-12-15 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15752262#comment-15752262
 ] 

Dave Marion commented on ACCUMULO-4535:
---

Also, is this the same as ACCUMULO-4196?

> HostRegexTableLoadBalancer fails with NullPointerException
> --
>
> Key: ACCUMULO-4535
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4535
> Project: Accumulo
>  Issue Type: Bug
>Affects Versions: 1.7.2
>Reporter: Adam J Shook
>Assignee: Adam J Shook
>
> Ran into an issue when testing out the {{HostRegexTableLoadBalancer}}.  It 
> fails to start and throws a {{NullPointerException}} when trying to access 
> the {{poolNameToRegexPattern}}.  The root cause is that the {{init}} function 
> the load balancer overrides is not the one the Master calls (wrong 
> signature).
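
A hedged illustration of that root cause with simplified stand-in types (not 
the actual TabletBalancer API): a subclass means to override init, but 
declares a different parameter type, so the caller's dispatch never reaches 
it and the subclass's fields are still null when first accessed:

{code}
class ConfigA {}
class ConfigB {}

class BaseBalancer {
  void init(ConfigA conf) { /* the method the Master actually calls */ }
}

class RegexBalancer extends BaseBalancer {
  java.util.Map<String,String> poolNameToRegexPattern;

  // Looks like an override, but the signature differs, so this is a brand-new
  // overload: init(ConfigA) still runs the base no-op, the map above is never
  // initialized, and the first access throws a NullPointerException.
  void init(ConfigB conf) {
    poolNameToRegexPattern = new java.util.HashMap<>();
  }
}
{code}

An @Override annotation on the subclass method would have turned this into a 
compile-time error, which is one reason the annotation is worth requiring.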



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4535) HostRegexTableLoadBalancer fails with NullPointerException

2016-12-15 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15752260#comment-15752260
 ] 

Dave Marion commented on ACCUMULO-4535:
---

[~adamjshook] Do you know if the Master changed since I added this? I'm a 
little confused, as I tested this and it did work at one point.

> HostRegexTableLoadBalancer fails with NullPointerException
> --
>
> Key: ACCUMULO-4535
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4535
> Project: Accumulo
>  Issue Type: Bug
>Affects Versions: 1.7.2
>Reporter: Adam J Shook
>Assignee: Adam J Shook
>
> Ran into an issue when testing out the {{HostRegexTableLoadBalancer}}.  It 
> fails to start and throws a {{NullPointerException}} when trying to access 
> the {{poolNameToRegexPattern}}.  The root cause is that the {{init}} function 
> the load balancer overrides is not the one the Master calls (wrong 
> signature).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-3590) Rolling upgrade support

2016-11-09 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-3590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15652154#comment-15652154
 ] 

Dave Marion commented on ACCUMULO-3590:
---

ACCUMULO-3569?

> Rolling upgrade support
> ---
>
> Key: ACCUMULO-3590
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3590
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Josh Elser
> Fix For: 2.0.0
>
>
> We really lack any support for rolling upgrades between versions of Accumulo.
> Need to have a way to provide seamless upgrades between certain versions of 
> Accumulo which minimize any downtime/impact on users of the system.
> Need to cover fundamental control features (like a rolling upgrade), changes 
> to our serialized data structures (help prevent us from goofing up 
> serialization across versions -- thrift/protobuf/etc), backwards/forward 
> compatible RFile and WAL support (including control), lots of tests, and lots 
> of documentation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-2222) add footnote to documentation regarding rfile placement in HDFS in relation to tserver assignment

2016-10-28 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-2222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15615945#comment-15615945
 ] 

Dave Marion commented on ACCUMULO-2222:
---

This may not be true depending on the configuration. 
https://issues.apache.org/jira/browse/HDFS-3702


> add footnote to documentation regarding rfile placement in HDFS in relation 
> to tserver assignment
> -
>
> Key: ACCUMULO-2222
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2222
> Project: Accumulo
>  Issue Type: Improvement
>  Components: docs
>Reporter: Arshak Navruzyan
>
> suggest adding a footnote to "2.3.1. Tablet Server."  Perhaps something like 
> this:
> HDFS's block placement policy determines which data nodes will hold a replica 
> of the Accumulo RFile.  The default HDFS placement policy is to put one 
> replica on one node in the local rack, another on a node in a different 
> (remote) rack, and the last on a different node in the same remote rack.  
> (http://hadoop.apache.org/docs/stable1/hdfs_design.html) 
> This generally ensures that the server assigned to a particular tablet will 
> be an HDFS replication location for the tablet's underlying RFile. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-3944) Tablet attempts to split or major compact after every bulk file import

2016-10-20 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-3944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592281#comment-15592281
 ] 

Dave Marion commented on ACCUMULO-3944:
---

I think we should remove the else block from the code in the description. 
Thoughts? Regarding the system being idle and not compacting, maybe the 
TABLE_MAJC_COMPACTALL_IDLETIME property should be lowered?
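
A hedged sketch of that suggestion, with stand-in types for the Tablet 
internals quoted below: keep the split check, drop the unconditional major 
compaction request, and let the tablet server's periodic checks decide when 
to compact:

{code}
class BulkImportFinishSketch {
  interface Tablet {
    boolean needsSplit();
    void executeSplit();
  }

  static void afterBulkImport(Tablet tablet) {
    if (tablet.needsSplit()) {
      tablet.executeSplit();
    }
    // else-branch removed: initiateMajorCompaction(NORMAL) no longer runs
    // after every bulk import; idle/ratio-based checks handle it instead.
  }
}
{code}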

> Tablet attempts to split or major compact after every bulk file import
> --
>
> Key: ACCUMULO-3944
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3944
> Project: Accumulo
>  Issue Type: Improvement
>  Components: tserver
>Reporter: Eric Newton
>Priority: Blocker
> Fix For: 2.0.0
>
>
> I noticed this bit of code in tablet, after it has bulk imported a file, but 
> before the bulk import is finished:
> {code}
>   if (needsSplit()) {
> getTabletServer().executeSplit(this);
>   } else {
> initiateMajorCompaction(MajorCompactionReason.NORMAL);
>   }
> {code}
> I'm pretty sure we can leave this to the normal tablet server mechanism for 
> deciding when to split or compact.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-2353) Test improvements to java.io.InputStream.skip() for possible Hadoop patch

2016-10-18 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586521#comment-15586521
 ] 

Dave Marion commented on ACCUMULO-2353:
---

Looks like this was fixed in Hadoop 2.8. What's the disposition for this ticket?

> Test improvements to java.io.InputStream.skip() for possible Hadoop patch
> 
>
> Key: ACCUMULO-2353
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2353
> Project: Accumulo
>  Issue Type: Task
> Environment: Java 6 update 45 or later
> Hadoop 2.2.0
>Reporter: Dave Marion
>Priority: Minor
>
> At some point (early Java 7 I think, then backported to around Java 6 Update 
> 45), the java.io.InputStream.skip() method was changed from reading byte[512] 
> to byte[2048]. The difference can be seen in DeflaterInputStream, which has 
> not been updated:
> {noformat}
> public long skip(long n) throws IOException {
> if (n < 0) {
> throw new IllegalArgumentException("negative skip length");
> }
> ensureOpen();
> // Skip bytes by repeatedly decompressing small blocks
> if (rbuf.length < 512)
> rbuf = new byte[512];
> int total = (int)Math.min(n, Integer.MAX_VALUE);
> long cnt = 0;
> while (total > 0) {
> // Read a small block of uncompressed bytes
> int len = read(rbuf, 0, (total <= rbuf.length ? total : 
> rbuf.length));
> if (len < 0) {
> break;
> }
> cnt += len;
> total -= len;
> }
> return cnt;
> }
> {noformat}
> and java.io.InputStream in Java 6 Update 45:
> {noformat}
> // MAX_SKIP_BUFFER_SIZE is used to determine the maximum buffer skip to
> // use when skipping.
> private static final int MAX_SKIP_BUFFER_SIZE = 2048;
> public long skip(long n) throws IOException {
>   long remaining = n;
>   int nr;
>   if (n <= 0) {
>   return 0;
>   }
>   
>   int size = (int)Math.min(MAX_SKIP_BUFFER_SIZE, remaining);
>   byte[] skipBuffer = new byte[size];
>   while (remaining > 0) {
>   nr = read(skipBuffer, 0, (int)Math.min(size, remaining));
>   
>   if (nr < 0) {
>   break;
>   }
>   remaining -= nr;
>   }
>   
>   return n - remaining;
> }
> {noformat}
> In sample tests I saw about a 20% improvement in skip() when seeking towards 
> the end of a locally cached compressed file. Looking at the 
> DecompressorStream in HDFS, the skip method is a near copy of the old 
> InputStream method:
> {noformat}
>   private byte[] skipBytes = new byte[512];
>   @Override
>   public long skip(long n) throws IOException {
> // Sanity checks
> if (n < 0) {
>   throw new IllegalArgumentException("negative skip length");
> }
> checkStream();
> 
> // Read 'n' bytes
> int skipped = 0;
> while (skipped < n) {
>   int len = Math.min(((int)n - skipped), skipBytes.length);
>   len = read(skipBytes, 0, len);
>   if (len == -1) {
> eof = true;
> break;
>   }
>   skipped += len;
> }
> return skipped;
>   }
> {noformat}
> This task is to evaluate the changes to DecompressorStream with a possible 
> patch to HDFS and a possible bug request to Oracle to port the 
> InputStream.skip changes to DeflaterInputStream.skip.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (ACCUMULO-4483) NegativeArraySizeException from scan thread right after minor compaction

2016-10-11 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion resolved ACCUMULO-4483.
---
Resolution: Fixed

> NegativeArraySizeException from scan thread right after minor compaction
> 
>
> Key: ACCUMULO-4483
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4483
> Project: Accumulo
>  Issue Type: Bug
>  Components: tserver
>Affects Versions: 1.8.0
> Environment: Accumulo 1.8.0
> Java 1.8.0_72
>Reporter: Dave Marion
>Assignee: Dave Marion
>Priority: Blocker
>  Labels: hackathon2016
> Fix For: 1.8.1, 2.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> I have been getting NegativeArraySizeExceptions after upgrading to Accumulo 
> 1.8.0. I have tracked it down to 2 or more concurrent scans on a tablet that 
> has just undergone minor compaction. The minor compaction thread writes the 
> in-memory map to a local temporary rfile and tries to switch the current 
> iterators to use it instead of the native map. The iterator code in the scan 
> thread may also switch itself to use the local temporary rfile if it notices 
> it before the minor compaction thread performs the switch. This all works up 
> to this point. Shortly after the switch one of the iterator threads will get 
> a NegativeArraySizeException from:
> SourceSwitchingIterator.seek() -> SourceSwitchingIterator.readNext() -> 
> MemKeyConversionIterator.seek() -> MemKeyConversionIterator.getTopKeyValue() 
> -> MemValue.decode(). I added a bunch of logging to find that the length of 
> the byte array passed to MemValue.decode is zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4483) NegativeArraySizeException from scan thread right after minor compaction

2016-10-11 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566931#comment-15566931
 ] 

Dave Marion commented on ACCUMULO-4483:
---

Test case added. [~ivan.bella] might add another.

> NegativeArraySizeException from scan thread right after minor compaction
> 
>
> Key: ACCUMULO-4483
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4483
> Project: Accumulo
>  Issue Type: Bug
>  Components: tserver
>Affects Versions: 1.8.0
> Environment: Accumulo 1.8.0
> Java 1.8.0_72
>Reporter: Dave Marion
>Assignee: Dave Marion
>Priority: Blocker
>  Labels: hackathon2016
> Fix For: 1.8.1, 2.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> I have been getting NegativeArraySizeExceptions after upgrading to Accumulo 
> 1.8.0. I have tracked it down to 2 or more concurrent scans on a tablet that 
> has just undergone minor compaction. The minor compaction thread writes the 
> in-memory map to a local temporary rfile and tries to switch the current 
> iterators to use it instead of the native map. The iterator code in the scan 
> thread may also switch itself to use the local temporary rfile if it notices 
> it before the minor compaction thread performs the switch. Everything works 
> correctly up to this point. Shortly after the switch, one of the iterator 
> threads will get a NegativeArraySizeException from:
> SourceSwitchingIterator.seek() -> SourceSwitchingIterator.readNext() -> 
> MemKeyConversionIterator.seek() -> MemKeyConversionIterator.getTopKeyValue() 
> -> MemValue.decode(). I added a bunch of logging to find that the length of 
> the byte array passed to MemValue.decode is zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4483) NegativeArraySizeException from scan thread right after minor compaction

2016-10-04 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15546370#comment-15546370
 ] 

Dave Marion commented on ACCUMULO-4483:
---

FWIW, I committed a fix for this and merged it up to master. I tested the fix 
and my issue disappeared. In talking with [~kturner], he suggested removing the 
call to Value.set() in MemValue.decode(). 
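
A hedged sketch of that fix's shape (the trailer layout here is assumed for 
illustration, not copied from the committed diff): compute the decoded count 
from the value's bytes without calling Value.set() on the input, so the source 
Value survives being decoded more than once.

{code:java}
// Assumes, for illustration, a trailing 4-byte big-endian count; the point
// is only that nothing mutates the caller's buffer.
static int decodeKvCount(byte[] buf) {
  int n = buf.length;
  return ((buf[n - 4] & 0xff) << 24)
      | ((buf[n - 3] & 0xff) << 16)
      | ((buf[n - 2] & 0xff) << 8)
      | (buf[n - 1] & 0xff);
}
{code}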

> NegativeArraySizeException from scan thread right after minor compaction
> 
>
> Key: ACCUMULO-4483
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4483
> Project: Accumulo
>  Issue Type: Bug
>  Components: tserver
>Affects Versions: 1.8.0
> Environment: Accumulo 1.8.0
> Java 1.8.0_72
>Reporter: Dave Marion
>Assignee: Dave Marion
>Priority: Blocker
> Fix For: 1.8.1, 2.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I have been getting NegativeArraySizeExceptions after upgrading to Accumulo 
> 1.8.0. I have tracked it down to 2 or more concurrent scans on a tablet that 
> has just undergone minor compaction. The minor compaction thread writes the 
> in-memory map to a local temporary rfile and tries to switch the current 
> iterators to use it instead of the native map. The iterator code in the scan 
> thread may also switch itself to use the local temporary rfile if it notices 
> it before the minor compaction thread performs the switch. Everything works 
> correctly up to this point. Shortly after the switch, one of the iterator 
> threads will get a NegativeArraySizeException from:
> SourceSwitchingIterator.seek() -> SourceSwitchingIterator.readNext() -> 
> MemKeyConversionIterator.seek() -> MemKeyConversionIterator.getTopKeyValue() 
> -> MemValue.decode(). I added a bunch of logging to find that the length of 
> the byte array passed to MemValue.decode is zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (ACCUMULO-4483) NegativeArraySizeException from scan thread right after minor compaction

2016-10-04 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion reassigned ACCUMULO-4483:
-

Assignee: Dave Marion

> NegativeArraySizeException from scan thread right after minor compaction
> 
>
> Key: ACCUMULO-4483
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4483
> Project: Accumulo
>  Issue Type: Bug
>  Components: tserver
>Affects Versions: 1.8.0
> Environment: Accumulo 1.8.0
> Java 1.8.0_72
>Reporter: Dave Marion
>Assignee: Dave Marion
>Priority: Blocker
> Fix For: 1.8.1, 2.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I have been getting NegativeArraySizeExceptions after upgrading to Accumulo 
> 1.8.0. I have tracked it down to 2 or more concurrent scans on a tablet that 
> has just undergone minor compaction. The minor compaction thread writes the 
> in-memory map to a local temporary rfile and tries to switch the current 
> iterators to use it instead of the native map. The iterator code in the scan 
> thread may also switch itself to use the local temporary rfile if it notices 
> it before the minor compaction thread performs the switch. Everything works 
> correctly up to this point. Shortly after the switch, one of the iterator 
> threads will get a NegativeArraySizeException from:
> SourceSwitchingIterator.seek() -> SourceSwitchingIterator.readNext() -> 
> MemKeyConversionIterator.seek() -> MemKeyConversionIterator.getTopKeyValue() 
> -> MemValue.decode(). I added a bunch of logging to find that the length of 
> the byte array passed to MemValue.decode is zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4483) NegativeArraySizeException from scan thread right after minor compaction

2016-10-04 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15546322#comment-15546322
 ] 

Dave Marion commented on ACCUMULO-4483:
---

bq. The Value.set() doesn't appear to have been modified in Keith's commit 
though.

 The old code called Value.set() on the input object and didn't return 
anything. The new code called Value.set() on the input object and returned a 
new MemValue. The calling code had been changed to use the return value and, 
I'm assuming, to reuse the input object.

bq.  Curious that we didn't uncover this in any testing for 1.8.0. This would 
be good to understand why (apparently?) our testing was insufficient to catch 
this.

 Not sure. The issue arose on multiple clusters as soon as I upgraded to 1.8.0. 
We run continuous ingest for our tests, but are we continuously scanning during 
the ingest?

> NegativeArraySizeException from scan thread right after minor compaction
> 
>
> Key: ACCUMULO-4483
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4483
> Project: Accumulo
>  Issue Type: Bug
>  Components: tserver
>Affects Versions: 1.8.0
> Environment: Accumulo 1.8.0
> Java 1.8.0_72
>Reporter: Dave Marion
>Priority: Blocker
> Fix For: 1.8.1
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I have been getting NegativeArraySizeExceptions after upgrading to Accumulo 
> 1.8.0. I have tracked it down to 2 or more concurrent scans on a tablet that 
> has just undergone minor compaction. The minor compaction thread writes the 
> in-memory map to a local temporary rfile and tries to switch the current 
> iterators to use it instead of the native map. The iterator code in the scan 
> thread may also switch itself to use the local temporary rfile if it notices 
> it before the minor compaction thread performs the switch. Everything works 
> correctly up to this point. Shortly after the switch, one of the iterator 
> threads will get a NegativeArraySizeException from:
> SourceSwitchingIterator.seek() -> SourceSwitchingIterator.readNext() -> 
> MemKeyConversionIterator.seek() -> MemKeyConversionIterator.getTopKeyValue() 
> -> MemValue.decode(). I added a bunch of logging to find that the length of 
> the byte array passed to MemValue.decode is zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4483) NegativeArraySizeException from scan thread right after minor compaction

2016-10-04 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15546264#comment-15546264
 ] 

Dave Marion commented on ACCUMULO-4483:
---

Not using sampling. Didn't know it existed until now.

> NegativeArraySizeException from scan thread right after minor compaction
> 
>
> Key: ACCUMULO-4483
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4483
> Project: Accumulo
>  Issue Type: Bug
>  Components: tserver
>Affects Versions: 1.8.0
> Environment: Accumulo 1.8.0
> Java 1.8.0_72
>Reporter: Dave Marion
>Priority: Blocker
> Fix For: 1.8.1
>
>
> I have been getting NegativeArraySizeExceptions after upgrading to Accumulo 
> 1.8.0. I have tracked it down to 2 or more concurrent scans on a tablet that 
> has just undergone minor compaction. The minor compaction thread writes the 
> in-memory map to a local temporary rfile and tries to switch the current 
> iterators to use it instead of the native map. The iterator code in the scan 
> thread may also switch itself to use the local temporary rfile if it notices 
> it before the minor compaction thread performs the switch. Everything works 
> correctly up to this point. Shortly after the switch, one of the iterator 
> threads will get a NegativeArraySizeException from:
> SourceSwitchingIterator.seek() -> SourceSwitchingIterator.readNext() -> 
> MemKeyConversionIterator.seek() -> MemKeyConversionIterator.getTopKeyValue() 
> -> MemValue.decode(). I added a bunch of logging to find that the length of 
> the byte array passed to MemValue.decode is zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4483) NegativeArraySizeException from scan thread right after minor compaction

2016-10-04 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15546255#comment-15546255
 ] 

Dave Marion commented on ACCUMULO-4483:
---

cc: [~ivan.bella]

Further analysis reveals that the underlying iterator is returning the same K,V 
twice and that MemValue.decode is modifying the source Value (which it should 
not). So, MemValue.decode() succeeds the first time and fails the second.
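
A self-contained toy reproduction of that failure mode (simplified names and 
encoding, not the Accumulo source), showing how a decode that truncates its 
shared source succeeds once and then throws the same exception class reported 
in this issue:

{code:java}
import java.util.Arrays;

public class DoubleDecodeDemo {
  // Stands in for the shared Value's backing bytes: payload plus a trailer.
  static byte[] data = new byte[] {1, 2, 3, 0, 0, 0, 7};

  // Mimics a decode() that mutates its source the way MemValue.decode
  // mutated the Value via set(): read the trailer, then truncate it off.
  static int decode() {
    int trailer = data[data.length - 1];
    data = Arrays.copyOf(data, data.length - 4); // mutates the shared state
    return trailer;
  }

  public static void main(String[] args) {
    System.out.println(decode()); // ok: prints 7, data shrinks to length 3
    System.out.println(decode()); // copyOf(data, -1) -> NegativeArraySizeException
  }
}
{code}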

> NegativeArraySizeException from scan thread right after minor compaction
> 
>
> Key: ACCUMULO-4483
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4483
> Project: Accumulo
>  Issue Type: Bug
>  Components: tserver
>Affects Versions: 1.8.0
> Environment: Accumulo 1.8.0
> Java 1.8.0_72
>Reporter: Dave Marion
>Priority: Blocker
> Fix For: 1.8.1
>
>
> I have been getting NegativeArraySizeExceptions after upgrading to Accumulo 
> 1.8.0. I have tracked it down to 2 or more concurrent scans on a tablet that 
> has just undergone minor compaction. The minor compaction thread writes the 
> in-memory map to a local temporary rfile and tries to switch the current 
> iterators to use it instead of the native map. The iterator code in the scan 
> thread may also switch itself to use the local temporary rfile if it notices 
> it before the minor compaction thread performs the switch. Everything works 
> correctly up to this point. Shortly after the switch, one of the iterator 
> threads will get a NegativeArraySizeException from:
> SourceSwitchingIterator.seek() -> SourceSwitchingIterator.readNext() -> 
> MemKeyConversionIterator.seek() -> MemKeyConversionIterator.getTopKeyValue() 
> -> MemValue.decode(). I added a bunch of logging to find that the length of 
> the byte array passed to MemValue.decode is zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (ACCUMULO-4484) Investigate SeekableByteArrayInputStream to determine if volatile being used correctly

2016-10-04 Thread Dave Marion (JIRA)
Dave Marion created ACCUMULO-4484:
-

 Summary: Investigate SeekableByteArrayInputStream to determine if 
volatile being used correctly
 Key: ACCUMULO-4484
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4484
 Project: Accumulo
  Issue Type: Bug
  Components: core
Affects Versions: 1.8.0
Reporter: Dave Marion


While looking at ACCUMULO-4483 I ran across SeekableByteArrayInputStream. I'm 
curious if the cur and max variables also need to be volatile as they are not 
read after the update to the volatile byte[].
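
For reference, the publication idiom in question looks roughly like this (a 
hedged sketch of the shape, not the actual SeekableByteArrayInputStream 
source): plain writes that happen-before a volatile write become visible to 
any thread whose read path starts with a read of that volatile field.

{code:java}
class SeekableBytes {
  private volatile byte[] buffer; // the volatile publication point
  private int cur;                // plain fields: safe only if every
  private int max;                // reader reads 'buffer' first

  void replace(byte[] newData) {
    cur = 0;              // plain writes...
    max = newData.length;
    buffer = newData;     // ...published by the volatile write
  }

  int remaining() {
    byte[] b = buffer;    // volatile read establishes happens-before
    return b == null ? 0 : max - cur; // plain reads of cur/max now safe
  }
}
{code}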



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4483) NegativeArraySizeException from scan thread right after minor compaction

2016-10-03 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15543390#comment-15543390
 ] 

Dave Marion commented on ACCUMULO-4483:
---

I should note that I did not see this in 1.7. This might also be related to 
ACCUMULO-4391.

> NegativeArraySizeException from scan thread right after minor compaction
> 
>
> Key: ACCUMULO-4483
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4483
> Project: Accumulo
>  Issue Type: Bug
>  Components: tserver
>Affects Versions: 1.8.0
> Environment: Accumulo 1.8.0
> Java 1.8.0_72
>Reporter: Dave Marion
>Priority: Critical
>
> I have been getting NegativeArraySizeExceptions after upgrading to Accumulo 
> 1.8.0. I have tracked it down to 2 or more concurrent scans on a tablet that 
> has just undergone minor compaction. The minor compaction thread writes the 
> in-memory map to a local temporary rfile and tries to switch the current 
> iterators to use it instead of the native map. The iterator code in the scan 
> thread may also switch itself to use the local temporary rfile if it notices 
> it before the minor compaction thread performs the switch. Everything works 
> correctly up to this point. Shortly after the switch, one of the iterator 
> threads will get a NegativeArraySizeException from:
> SourceSwitchingIterator.seek() -> SourceSwitchingIterator.readNext() -> 
> MemKeyConversionIterator.seek() -> MemKeyConversionIterator.getTopKeyValue() 
> -> MemValue.decode(). I added a bunch of logging to find that the length of 
> the byte array passed to MemValue.decode is zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (ACCUMULO-4483) NegativeArraySizeException from scan thread right after minor compaction

2016-10-03 Thread Dave Marion (JIRA)
Dave Marion created ACCUMULO-4483:
-

 Summary: NegativeArraySizeException from scan thread right after 
minor compaction
 Key: ACCUMULO-4483
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4483
 Project: Accumulo
  Issue Type: Bug
  Components: tserver
Affects Versions: 1.8.0
 Environment: Accumulo 1.8.0
Java 1.8.0_72
Reporter: Dave Marion
Priority: Critical


I have been getting NegativeArraySizeExceptions after upgrading to Accumulo 
1.8.0. I have tracked it down to 2 or more concurrent scans on a tablet that 
has just undergone minor compaction. The minor compaction thread writes the 
in-memory map to a local temporary rfile and tries to switch the current 
iterators to use it instead of the native map. The iterator code in the scan 
thread may also switch itself to use the local temporary rfile if it notices it 
before the minor compaction thread performs the switch. Everything works 
correctly up to this point. Shortly after the switch, one of the iterator 
threads will get a NegativeArraySizeException from:
SourceSwitchingIterator.seek() -> SourceSwitchingIterator.readNext() -> 
MemKeyConversionIterator.seek() -> MemKeyConversionIterator.getTopKeyValue() 
-> MemValue.decode(). I added a bunch of logging to find that the length of the 
byte array passed to MemValue.decode is zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4453) Remove debug log spam from ScanDataSource

2016-09-10 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15479863#comment-15479863
 ] 

Dave Marion commented on ACCUMULO-4453:
---

Likely so that I could determine that the scan context was set correctly. 
Wouldn't say it's completely useless, maybe appropriate for trace logging 
instead.

> Remove debug log spam from ScanDataSource
> -
>
> Key: ACCUMULO-4453
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4453
> Project: Accumulo
>  Issue Type: Sub-task
>  Components: tserver
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Minor
> Fix For: 2.0.0
>
>
> I happened to be looking at a TabletServer log file for an ExampleIT method 
> and just saw tons of 
> {noformat}
> 2016-09-09 17:18:08,617 [tablet.ScanDataSource] DEBUG: new scan data source, 
> tablet: org.apache.accumulo.tserver.tablet.Tablet@58eb5149, options: [auths=, 
> batchTimeOut=9223372036854775807, context=null, columns=[], 
> interruptFlag=false, isolated=false, num=1000, samplerConfig=null], 
> interruptFlag: false, loadIterators: true
> 2016-09-09 17:18:08,625 [tablet.ScanDataSource] DEBUG: new scan data source, 
> tablet: org.apache.accumulo.tserver.tablet.Tablet@58eb5149, options: [auths=, 
> batchTimeOut=9223372036854775807, context=null, columns=[], 
> interruptFlag=false, isolated=false, num=1000, samplerConfig=null], 
> interruptFlag: false, loadIterators: true
> 2016-09-09 17:18:08,630 [tablet.ScanDataSource] DEBUG: new scan data source, 
> tablet: org.apache.accumulo.tserver.tablet.Tablet@58eb5149, options: [auths=, 
> batchTimeOut=9223372036854775807, context=null, columns=[], 
> interruptFlag=false, isolated=false, num=1000, samplerConfig=null], 
> interruptFlag: false, loadIterators: true
> 2016-09-09 17:18:08,635 [tablet.ScanDataSource] DEBUG: new scan data source, 
> tablet: org.apache.accumulo.tserver.tablet.Tablet@58eb5149, options: [auths=, 
> batchTimeOut=9223372036854775807, context=null, columns=[], 
> interruptFlag=false, isolated=false, num=1000, samplerConfig=null], 
> interruptFlag: false, loadIterators: true
> 2016-09-09 17:18:08,640 [tablet.ScanDataSource] DEBUG: new scan data source, 
> tablet: org.apache.accumulo.tserver.tablet.Tablet@58eb5149, options: [auths=, 
> batchTimeOut=9223372036854775807, context=null, columns=[], 
> interruptFlag=false, isolated=false, num=1000, samplerConfig=null], 
> interruptFlag: false, loadIterators: true
> 2016-09-09 17:18:08,645 [tablet.ScanDataSource] DEBUG: new scan data source, 
> tablet: org.apache.accumulo.tserver.tablet.Tablet@58eb5149, options: [auths=, 
> batchTimeOut=9223372036854775807, context=null, columns=[], 
> interruptFlag=false, isolated=false, num=1000, samplerConfig=null], 
> interruptFlag: false, loadIterators: true
> 2016-09-09 17:18:08,651 [tablet.ScanDataSource] DEBUG: new scan data source, 
> tablet: org.apache.accumulo.tserver.tablet.Tablet@58eb5149, options: [auths=, 
> batchTimeOut=9223372036854775807, context=null, columns=[], 
> interruptFlag=false, isolated=false, num=1000, samplerConfig=null], 
> interruptFlag: false, loadIterators: true
> 2016-09-09 17:18:08,657 [tablet.ScanDataSource] DEBUG: new scan data source, 
> tablet: org.apache.accumulo.tserver.tablet.Tablet@58eb5149, options: [auths=, 
> batchTimeOut=9223372036854775807, context=null, columns=[], 
> interruptFlag=false, isolated=false, num=1000, samplerConfig=null], 
> interruptFlag: false, loadIterators: true
> 2016-09-09 17:18:08,662 [tablet.ScanDataSource] DEBUG: new scan data source, 
> tablet: org.apache.accumulo.tserver.tablet.Tablet@58eb5149, options: [auths=, 
> batchTimeOut=9223372036854775807, context=null, columns=[], 
> interruptFlag=false, isolated=false, num=1000, samplerConfig=null], 
> interruptFlag: false, loadIterators: true
> 2016-09-09 17:18:08,667 [tablet.ScanDataSource] DEBUG: new scan data source, 
> tablet: org.apache.accumulo.tserver.tablet.Tablet@58eb5149, options: [auths=, 
> batchTimeOut=9223372036854775807, context=null, columns=[], 
> interruptFlag=false, isolated=false, num=1000, samplerConfig=null], 
> interruptFlag: false, loadIterators: true
> 2016-09-09 17:18:08,673 [tablet.ScanDataSource] DEBUG: new scan data source, 
> tablet: org.apache.accumulo.tserver.tablet.Tablet@58eb5149, options: [auths=, 
> batchTimeOut=9223372036854775807, context=null, columns=[], 
> interruptFlag=false, isolated=false, num=1000, samplerConfig=null], 
> interruptFlag: false, loadIterators: true
> 2016-09-09 17:18:08,677 [tablet.ScanDataSource] DEBUG: new scan data source, 
> tablet: org.apache.accumulo.tserver.tablet.Tablet@58eb5149, options: [auths=, 
> batchTimeOut=9223372036854775807, context=null, columns=[], 
> interruptFlag=false, isolated=false, num=1000, samplerConfig=null], 
> interruptFlag: false, 

[jira] [Commented] (ACCUMULO-1495) Host multiple read-only versions of tablets for query performance

2016-09-09 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15478120#comment-15478120
 ] 

Dave Marion commented on ACCUMULO-1495:
---

Hey [~ctubbsii] : What about moving the major compactor process outside of the 
tablet server as an initial step?

> Host multiple read-only versions of tablets for query performance
> -
>
> Key: ACCUMULO-1495
> URL: https://issues.apache.org/jira/browse/ACCUMULO-1495
> Project: Accumulo
>  Issue Type: New Feature
>Reporter: Christopher Tubbs
>
> It might be worth providing read-only views of tablets on multiple servers, 
> while writes go to a single authoritative server. This could improve query 
> performance in some cases.
> Some things to consider:
> # Consistency: can I read what I've just written?
> # Configuration: how many copies?
> # Configuration: how long between syncs?
> # Performance: locality, number of file replicas
> Variation:
> Permit multiple hosting only on tables that have been locked for writes 
> (useful with a strictly enforced clone-and-lock feature).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4376) Introduce a Builders for "data" classes

2016-08-27 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15441771#comment-15441771
 ] 

Dave Marion commented on ACCUMULO-4376:
---

Nothing specific, just looking for proposed changes that may affect the builder 
classes. I am of the opinion that we should not deter someone from contributing 
a piece of work because it may change in the future. If it makes sense, doesn't 
conflict with future plans, and you are willing to do the work, then let's get 
it done. I am also of the opinion that we can introduce API changes slowly 
over time instead of an all-new API in a major release.

> Introduce a Builders for "data" classes
> ---
>
> Key: ACCUMULO-4376
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4376
> Project: Accumulo
>  Issue Type: Improvement
>  Components: client
>Reporter: Josh Elser
> Fix For: 2.0.0
>
>
> In looking at ACCUMULO-4375, I was a little frustrated at how we have 3x 
> constructors than Key really provides just to support {{byte[]}}, {{Text}}, 
> and {{CharSequence}} arguments. Additionally, the {{copy}} argument forces 
> the user to use the most specific (most arguments) constructor if they want 
> to avoid the copy. This makes constructing a Key from just a row while 
> avoiding a copy very pedantic.
> I think a KeyBuilder (or KeyFactory) class would be a big usability benefit 
> and reduce the amount of code that clients have to write to most efficiently 
> construct Keys.
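
To make the shape concrete, a hypothetical builder call site might look like 
the following (KeyBuilder and its method names are invented here for 
illustration; this is not the actual API):

{code:java}
// Hypothetical API, for illustration only:
Key k = new KeyBuilder()
    .row(new byte[] {1, 2, 3})   // byte[] accepted directly, no Text boxing
    .family("cf")                // CharSequence overload
    .qualifier("cq")
    .timestamp(42L)
    .copy(false)                 // opt out of the defensive copy explicitly
    .build();
{code}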



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4376) Introduce a Builders for "data" classes

2016-08-27 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15441568#comment-15441568
 ] 

Dave Marion commented on ACCUMULO-4376:
---

Take a look at ACCUMULO-2589 and see if you can come up with something that 
might be future-proof, at least with the information we have today.

> Introduce a Builders for "data" classes
> ---
>
> Key: ACCUMULO-4376
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4376
> Project: Accumulo
>  Issue Type: Improvement
>  Components: client
>Reporter: Josh Elser
> Fix For: 2.0.0
>
>
> In looking at ACCUMULO-4375, I was a little frustrated at how we have 3x 
> constructors than Key really provides just to support {{byte[]}}, {{Text}}, 
> and {{CharSequence}} arguments. Additionally, the {{copy}} argument forces 
> the user to use the most specific (most arguments) constructor if they want 
> to avoid the copy. This makes constructing a Key from just a row while 
> avoiding a copy very pedantic.
> I think a KeyBuilder (or KeyFactory) class would be a big usability benefit 
> and reduce the amount of code that clients have to write to most efficiently 
> construct Keys.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4376) Introduce a Builders for "data" classes

2016-08-27 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15441564#comment-15441564
 ] 

Dave Marion commented on ACCUMULO-4376:
---

Nevermind, I should have searched first.

> Introduce a Builders for "data" classes
> ---
>
> Key: ACCUMULO-4376
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4376
> Project: Accumulo
>  Issue Type: Improvement
>  Components: client
>Reporter: Josh Elser
> Fix For: 2.0.0
>
>
> In looking at ACCUMULO-4375, I was a little frustrated at how we have 3x 
> constructors than Key really provides just to support {{byte[]}}, {{Text}}, 
> and {{CharSequence}} arguments. Additionally, the {{copy}} argument forces 
> the user to use the most specific (most arguments) constructor if they want 
> to avoid the copy. This makes constructing a Key from just a row while 
> avoiding a copy very pedantic.
> I think a KeyBuilder (or KeyFactory) class would be a big usability benefit 
> and reduce the amount of code that clients have to write to most efficiently 
> construct Keys.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4376) Introduce a Builders for "data" classes

2016-08-27 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15441560#comment-15441560
 ] 

Dave Marion commented on ACCUMULO-4376:
---

Is the API design for 2.0 documented somewhere?

> Introduce a Builders for "data" classes
> ---
>
> Key: ACCUMULO-4376
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4376
> Project: Accumulo
>  Issue Type: Improvement
>  Components: client
>Reporter: Josh Elser
> Fix For: 2.0.0
>
>
> In looking at ACCUMULO-4375, I was a little frustrated at how we have 3x 
> constructors than Key really provides just to support {{byte[]}}, {{Text}}, 
> and {{CharSequence}} arguments. Additionally, the {{copy}} argument forces 
> the user to use the most specific (most arguments) constructor if they want 
> to avoid the copy. This makes constructing a Key from just a row while 
> avoiding a copy very pedantic.
> I think a KeyBuilder (or KeyFactory) class would be a big usability benefit 
> and reduce the amount of code that clients have to write to most efficiently 
> construct Keys.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (ACCUMULO-4163) ZooZap should deal with /tracers being missing

2016-08-23 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion resolved ACCUMULO-4163.
---
Resolution: Fixed

> ZooZap should deal with /tracers being missing
> --
>
> Key: ACCUMULO-4163
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4163
> Project: Accumulo
>  Issue Type: Bug
>  Components: zookeeper
>Affects Versions: 1.7.1
>Reporter: Josh Elser
>Assignee: Dave Marion
>Priority: Minor
> Fix For: 1.6.6, 1.7.3, 1.8.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {noformat}
> 2016-03-14 16:57:15,316 [util.ZooZap] ERROR: Exception running zapDirectory.
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
> NoNode for /tracers
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>   at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1532)
>   at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1560)
>   at 
> org.apache.accumulo.fate.zookeeper.ZooReader.getChildren(ZooReader.java:151)
>   at org.apache.accumulo.server.util.ZooZap.zapDirectory(ZooZap.java:117)
>   at org.apache.accumulo.server.util.ZooZap.main(ZooZap.java:110)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.accumulo.start.Main$2.run(Main.java:130)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This shouldn't be an error that is propagated to the user. We should 
> know that it's ok if '/tracers' doesn't exist and move on silently.
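
The shape of the fix would be something like the following sketch (method and 
variable names assumed, not the committed patch):

{code:java}
try {
  zapDirectory(zoo, Constants.ZTRACERS, opts);
} catch (KeeperException.NoNodeException e) {
  // /tracers was never created on this instance; nothing to zap, move on
}
{code}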



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (ACCUMULO-4163) ZooZap should deal with /tracers being missing

2016-08-23 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion updated ACCUMULO-4163:
--
Fix Version/s: (was: 1.8.1)
   1.8.0

> ZooZap should deal with /tracers being missing
> --
>
> Key: ACCUMULO-4163
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4163
> Project: Accumulo
>  Issue Type: Bug
>  Components: zookeeper
>Affects Versions: 1.7.1
>Reporter: Josh Elser
>Assignee: Dave Marion
>Priority: Minor
> Fix For: 1.6.6, 1.7.3, 1.8.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {noformat}
> 2016-03-14 16:57:15,316 [util.ZooZap] ERROR: Exception running zapDirectory.
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
> NoNode for /tracers
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>   at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1532)
>   at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1560)
>   at 
> org.apache.accumulo.fate.zookeeper.ZooReader.getChildren(ZooReader.java:151)
>   at org.apache.accumulo.server.util.ZooZap.zapDirectory(ZooZap.java:117)
>   at org.apache.accumulo.server.util.ZooZap.main(ZooZap.java:110)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.accumulo.start.Main$2.run(Main.java:130)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This shouldn't be an error that is propagated to the user. We should 
> know that it's ok if '/tracers' doesn't exist and move on silently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (ACCUMULO-4163) ZooZap should deal with /tracers being missing

2016-08-22 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion reassigned ACCUMULO-4163:
-

Assignee: Dave Marion

> ZooZap should deal with /tracers being missing
> --
>
> Key: ACCUMULO-4163
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4163
> Project: Accumulo
>  Issue Type: Bug
>  Components: zookeeper
>Affects Versions: 1.7.1
>Reporter: Josh Elser
>Assignee: Dave Marion
>Priority: Minor
> Fix For: 1.6.6, 1.7.3, 1.8.1
>
>
> {noformat}
> 2016-03-14 16:57:15,316 [util.ZooZap] ERROR: Exception running zapDirectory.
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
> NoNode for /tracers
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>   at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1532)
>   at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1560)
>   at 
> org.apache.accumulo.fate.zookeeper.ZooReader.getChildren(ZooReader.java:151)
>   at org.apache.accumulo.server.util.ZooZap.zapDirectory(ZooZap.java:117)
>   at org.apache.accumulo.server.util.ZooZap.main(ZooZap.java:110)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.accumulo.start.Main$2.run(Main.java:130)
>   at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This shouldn't be an error that is propagated to the user. We should 
> know that it's ok if '/tracers' doesn't exist and move on silently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4315) Log message during bulk import is misleading "Found candidate Volumes for Path but none of the Volumes are valid for the candidates [path]"

2016-08-16 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15423220#comment-15423220
 ] 

Dave Marion commented on ACCUMULO-4315:
---

+1 to deleting the log message and associated comment.

> Log message during bulk import is misleading "Found candidate Volumes for 
> Path but none of the Volumes are valid for the candidates [path]"
> ---
>
> Key: ACCUMULO-4315
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4315
> Project: Accumulo
>  Issue Type: Improvement
>Affects Versions: 1.6.5, 1.7.1, 1.8.0
>Reporter: Ed Coleman
>Priority: Trivial
> Fix For: 1.6.6, 1.7.3, 1.8.1
>
>
> During bulk ingest there are two debug-level messages generated in the master 
> debug log: one looks like a problem, and then the next message indicates that 
> the file was moved successfully to the target tablet directory. 
> The message in org.apache.accumulo.server.fs.VolumeManager is "Found 
> candidate Volumes for Path but none of the Volumes are valid for the 
> candidates [path]"
> This seems to occur because the directory of the "source" bulk ingest is not 
> in the Accumulo instance.volume (on purpose), which is probably the general 
> case for bulk ingest.
> The message could be reworded so that it does not appear that there is an 
> issue with bulk ingest. Depending on where else the condition can occur, the 
> level could also be lowered to trace if other messages describe more clearly 
> which operation is potentially having an issue with a file outside of the 
> Accumulo instance volume.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (ACCUMULO-4408) Improve the agitator

2016-08-16 Thread Dave Marion (JIRA)
Dave Marion created ACCUMULO-4408:
-

 Summary: Improve the agitator
 Key: ACCUMULO-4408
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4408
 Project: Accumulo
  Issue Type: Task
Reporter: Dave Marion


It looks like code is duplicated in the perl scripts. We should be able to 
reduce this duplication and add some host-specific actions for CPU and network, 
like Netflix's Chaos Monkey does. See 
https://github.com/Netflix/SimianArmy/tree/master/src/main/resources/scripts



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-3830) Repeated attempts to bind already bound port in tserver

2016-08-16 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15423082#comment-15423082
 ] 

Dave Marion commented on ACCUMULO-3830:
---

This may be OBE with ACCUMULO-4331.

> Repeated attempts to bind already bound port in tserver
> ---
>
> Key: ACCUMULO-3830
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3830
> Project: Accumulo
>  Issue Type: Bug
>  Components: tserver
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Minor
> Fix For: 1.7.3, 1.8.1
>
>
> {noformat}
> 2015-05-19 17:56:00,792 [client.ClientConfiguration] WARN : Client 
> configuration constructed with a Configuration that did not have list 
> delimiter disabled or overridden, multi-valued config properties may be 
> unavailable
> 2015-05-19 17:56:00,793 [rpc.TServerUtils] DEBUG: Instantiating SASL Thrift 
> server
> 2015-05-19 17:56:00,793 [rpc.TServerUtils] INFO : Creating SASL thread pool 
> thrift server on listening on hostname:9997
> 2015-05-19 17:56:00,793 [rpc.TServerUtils] ERROR: Unable to start TServer
> org.apache.thrift.transport.TTransportException: Could not create 
> ServerSocket on address 0.0.0.0/0.0.0.0:9997.
> at 
> org.apache.thrift.transport.TServerSocket.(TServerSocket.java:93)
> at 
> org.apache.thrift.transport.TServerSocket.(TServerSocket.java:75)
> at 
> org.apache.accumulo.server.rpc.TServerUtils.createSaslThreadPoolServer(TServerUtils.java:388)
> at 
> org.apache.accumulo.server.rpc.TServerUtils.startTServer(TServerUtils.java:498)
> at 
> org.apache.accumulo.server.rpc.TServerUtils.startTServer(TServerUtils.java:472)
> at 
> org.apache.accumulo.server.rpc.TServerUtils.startServer(TServerUtils.java:151)
> at 
> org.apache.accumulo.tserver.TabletServer.startServer(TabletServer.java:2266)
> at 
> org.apache.accumulo.tserver.TabletServer.startTabletClientService(TabletServer.java:2314)
> at 
> org.apache.accumulo.tserver.TabletServer.run(TabletServer.java:2450)
> at 
> org.apache.accumulo.tserver.TabletServer$9.run(TabletServer.java:2963)
> at 
> org.apache.accumulo.tserver.TabletServer$9.run(TabletServer.java:2960)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.accumulo.tserver.TabletServer.main(TabletServer.java:2960)
> at 
> org.apache.accumulo.tserver.TServerExecutable.execute(TServerExecutable.java:33)
> at org.apache.accumulo.start.Main$1.run(Main.java:93)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> I'm seeing loops of 100 of the above when a tabletserver tries to start on a 
> node in which it's already running. I think this is the port-shift code that 
> tries to jump around and choose a different value. When we have a static 
> value, trying to jump around is rather useless.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (ACCUMULO-4379) Clarify the difference between Hadoop and Accumulo native libs in bootstrap_config.sh script

2016-08-16 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion resolved ACCUMULO-4379.
---
Resolution: Fixed

Made messages a little more clear.

> Clarify the difference between Hadoop and Accumulo native libs in 
> bootstrap_config.sh script
> 
>
> Key: ACCUMULO-4379
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4379
> Project: Accumulo
>  Issue Type: Improvement
>  Components: scripts
>Reporter: Lindsey Kuper
>Assignee: Dave Marion
>Priority: Trivial
> Fix For: 1.6.6, 1.7.3, 1.8.1
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Lindsey Kuper on serverfault expressed some confusion about 
> bootstrap_config.sh's warning messages in 
> http://serverfault.com/questions/790154/what-should-hadoop-prefix-be-for-accumulo-installation/790255
> Rightfully so: the script's warnings don't differentiate between what is a 
> Hadoop setup issue and what is an Accumulo setup issue.
> We can easily improve this with better error messages.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (ACCUMULO-4379) Clarify the difference between Hadoop and Accumulo native libs in bootstrap_config.sh script

2016-08-16 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion reassigned ACCUMULO-4379:
-

Assignee: Dave Marion

> Clarify the difference between Hadoop and Accumulo native libs in 
> bootstrap_config.sh script
> 
>
> Key: ACCUMULO-4379
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4379
> Project: Accumulo
>  Issue Type: Improvement
>  Components: scripts
>Reporter: Lindsey Kuper
>Assignee: Dave Marion
>Priority: Trivial
> Fix For: 1.6.6, 1.7.3, 1.8.1
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Lindsey Kuper on serverfault expressed some confusion about 
> bootstrap_config.sh's warning messages in 
> http://serverfault.com/questions/790154/what-should-hadoop-prefix-be-for-accumulo-installation/790255
> Rightfully so: the script's warnings don't differentiate between what is a 
> Hadoop setup issue and what is an Accumulo setup issue.
> We can easily improve this with better error messages.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (ACCUMULO-4307) semver compliance maven plugin

2016-08-16 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion updated ACCUMULO-4307:
--
Attachment: ACCUMULO-4307-1.patch

I just updated this and ran it on the 1.8 branch. Drop the changes into the 
core pom.xml file, and run 'mvn clean package -DskipTests -Psemver-compliance'. 
I think we will need to drop this into the MAC pom also.

> semver compliance maven plugin
> --
>
> Key: ACCUMULO-4307
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4307
> Project: Accumulo
>  Issue Type: New Feature
>  Components: build
>Reporter: Dave Marion
>Assignee: Christopher Tubbs
> Fix For: 1.6.6, 1.7.3, 1.8.1
>
> Attachments: ACCUMULO-4307-1.patch
>
>
> Found https://siom79.github.io/japicmp/MavenPlugin.html today and tested it 
> out. Was wondering what thoughts are for incorporating this into the build. I 
> tested it on 1.6.6-SNAPSHOT by dropping the following into the test pom file:
> {noformat}
> <profile>
>   <id>semver-compliance</id>
>   <build>
>     <plugins>
>       <plugin>
>         <groupId>com.github.siom79.japicmp</groupId>
>         <artifactId>japicmp-maven-plugin</artifactId>
>         <version>0.7.2</version>
>         <executions>
>           <execution>
>             <id>client compliance</id>
>             <goals>
>               <goal>cmp</goal>
>             </goals>
>             <phase>verify</phase>
>             <configuration>
>               <oldVersion>
>                 <dependency>
>                   <groupId>org.apache.accumulo</groupId>
>                   <artifactId>accumulo-core</artifactId>
>                   <version>1.6.5</version>
>                   <type>jar</type>
>                 </dependency>
>               </oldVersion>
>               <newVersion>
>                 <dependency>
>                   <groupId>org.apache.accumulo</groupId>
>                   <artifactId>accumulo-core</artifactId>
>                   <version>${project.version}</version>
>                   <type>jar</type>
>                 </dependency>
>               </newVersion>
>               <parameter>
>                 <includes>
>                   <include>org.apache.accumulo.core.client</include>
>                   <include>org.apache.accumulo.core.data</include>
>                   <include>org.apache.accumulo.core.security</include>
>                 </includes>
>                 <excludes>
>                   <exclude>*crypto*</exclude>
>                   <exclude>*impl*</exclude>
>                   <exclude>*thrift*</exclude>
>                 </excludes>
>                 <accessModifier>protected</accessModifier>
>                 <breakBuildBasedOnSemanticVersioning>true</breakBuildBasedOnSemanticVersioning>
>                 <!-- a second boolean flag was set to true here; its element name was lost when the mail was archived -->
>               </parameter>
>             </configuration>
>           </execution>
>         </executions>
>       </plugin>
>     </plugins>
>   </build>
> </profile>
> {noformat}
>  I tried getting the previous release version number using the 
> build-helper-maven-plugin, but it found the wrong version. If we use this we 
> would also have to include an execution for minicluster and determine whether 
> or not we want to use the reporting feature of the plugin.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-1013) Integrate with a scalable monitoring tool

2016-08-16 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15422711#comment-15422711
 ] 

Dave Marion commented on ACCUMULO-1013:
---

FWIW, I added a StatsDSink to the Hadoop Metrics2 framework in HADOOP-12360. 
One potential solution would be to 
[configure|https://nationalsecurityagency.github.io/timely/docs/#collecting-metrics-from-hadoop-and-accumulo] 
Hadoop and Accumulo to emit their metrics to a CollectD service and run a 
[Timely|https://nationalsecurityagency.github.io/timely/] server.
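
As a sketch of what that wiring might look like on the Hadoop side (property 
names follow the metrics2 sink convention; the prefix, host, and port are 
placeholders):

{noformat}
# hadoop-metrics2-accumulo.properties (illustrative)
accumulo.sink.statsd.class=org.apache.hadoop.metrics2.sink.StatsDSink
accumulo.sink.statsd.server.host=collectd.example.com
accumulo.sink.statsd.server.port=8125
{noformat}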

> Integrate with a scalable monitoring tool
> -
>
> Key: ACCUMULO-1013
> URL: https://issues.apache.org/jira/browse/ACCUMULO-1013
> Project: Accumulo
>  Issue Type: Sub-task
>  Components: monitor
>Reporter: Eric Newton
>Priority: Minor
>  Labels: gsoc2013, mentor
> Fix For: 1.9.0
>
>
> The monitor is awesome.  It should die.
> I'm going to move other monitor tickets under this one (if I can), and create 
> some requirement tickets.
> We would be better off putting our weight behind an existing monitoring 
> program which can scale, if one exists.
> Hopefully we can combine tracing efforts and have a nicer distributed 
> trace-based tool, too.
> For display functionality, lots of possibilities: Graphite, Cubism.js, D3.js 
> (really, any number of really slick Javascript graphing libraries). For log 
> collection, any number of distributed log management services out there too 
> can serve as inspiration for functionality: statsd, logstash, cacti/rrdtool.
> Currently all of Accumulo monitoring information is exposed via JMX; a nice 
> balance could be found leveraging the existing monitoring capabilities with 
> JMXTrans (or equivalent) and applying a new GUI.
> Familiarity with Java and JMX would be ideal.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (ACCUMULO-4406) .out and .err files should have same naming convention as log files

2016-08-15 Thread Dave Marion (JIRA)
Dave Marion created ACCUMULO-4406:
-

 Summary: .out and .err files should have same naming convention as 
log files
 Key: ACCUMULO-4406
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4406
 Project: Accumulo
  Issue Type: Improvement
  Components: scripts
Reporter: Dave Marion
Assignee: Dave Marion
 Fix For: 1.8.0






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4396) Remove dependency on MiniDFS from MiniAccumulo

2016-08-03 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406153#comment-15406153
 ] 

Dave Marion commented on ACCUMULO-4396:
---

It would also be nice to be able to restart MiniAccumulo with the files in the 
same directory and not lose information.

> Remove dependency on MiniDFS from MiniAccumulo
> --
>
> Key: ACCUMULO-4396
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4396
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Keith Turner
>
> Accumulo uses MiniAccumulo to test itself.  Some of Accumulo's own tests use 
> MiniDFS.  This functionality is not available to users of Accumulo, but the 
> dependency on MiniDFS is forced on users (which brings along a lot of other 
> Hadoop server-side dependencies).  Accumulo's internal testing code should be 
> refactored to move MiniDFS out of MiniAccumulo's implementation, removing the 
> dependency.  Tests that need MiniDFS should set it up and configure 
> MiniAccumulo to use it; utility code could be provided to do this.   Then the 
> dependency on MiniDFS could move to Accumulo's test code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (ACCUMULO-4392) Expose metrics2 JVM metrics

2016-07-28 Thread Dave Marion (JIRA)
Dave Marion created ACCUMULO-4392:
-

 Summary: Expose metrics2 JVM metrics
 Key: ACCUMULO-4392
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4392
 Project: Accumulo
  Issue Type: Improvement
Reporter: Dave Marion
 Fix For: 1.9.0


The Hadoop services create a JVMMetrics instance in their server metrics 
objects (e.g. DataNodeMetrics) to export internal JVM metrics. It does not 
appear that this is happening for the Accumulo server processes.
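
For comparison, the Hadoop-side registration is roughly the following (a sketch 
using the public metrics2 API; where Accumulo would hook this in is exactly the 
open question of this issue):

{code:java}
import org.apache.hadoop.metrics2.MetricsSystem;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.apache.hadoop.metrics2.source.JvmMetrics;

// registers heap, GC, and thread metrics under the given process name
MetricsSystem ms = DefaultMetricsSystem.initialize("TServer");
JvmMetrics.create("TServer", "", ms);
{code}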



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4128) ShellServerIT#scansWithClassLoaderContext fails consistently

2016-07-20 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385735#comment-15385735
 ] 

Dave Marion commented on ACCUMULO-4128:
---

Test changed and committed.

> ShellServerIT#scansWithClassLoaderContext fails consistently
> 
>
> Key: ACCUMULO-4128
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4128
> Project: Accumulo
>  Issue Type: Bug
>  Components: test
>Reporter: Eric Newton
>Assignee: Dave Marion
> Fix For: 2.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4128) ShellServerIT#scansWithClassLoaderContext fails consistently

2016-07-20 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15385712#comment-15385712
 ] 

Dave Marion commented on ACCUMULO-4128:
---

I'll change the test to use setshelliter. Thanks [~ctubbsii].

> ShellServerIT#scansWithClassLoaderContext fails consistently
> 
>
> Key: ACCUMULO-4128
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4128
> Project: Accumulo
>  Issue Type: Bug
>  Components: test
>Reporter: Eric Newton
>Assignee: Dave Marion
> Fix For: 2.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4128) ShellServerIT#scansWithClassLoaderContext fails consistently

2016-07-19 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384381#comment-15384381
 ] 

Dave Marion commented on ACCUMULO-4128:
---

bq. By the letter of the law, shell commands are presently covered by semver

They are? Or are they not? If they are covered, then they have to be deprecated 
in a prior release and then removed in a major release. If not, then anything 
is fair game.

bq. Did you happen to do any digging to figure out what commit/issue changed 
this in master?

I did not.

> ShellServerIT#scansWithClassLoaderContext fails consistently
> 
>
> Key: ACCUMULO-4128
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4128
> Project: Accumulo
>  Issue Type: Bug
>  Components: test
>Reporter: Eric Newton
>Assignee: Dave Marion
> Fix For: 2.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (ACCUMULO-4380) SetScanIter command in shell does not exist in 2.0, needs to be marked deprecated

2016-07-19 Thread Dave Marion (JIRA)
Dave Marion created ACCUMULO-4380:
-

 Summary: SetScanIter command in shell does not exist in 2.0, needs 
to be marked deprecated
 Key: ACCUMULO-4380
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4380
 Project: Accumulo
  Issue Type: Task
  Components: shell
Reporter: Dave Marion
Priority: Blocker
 Fix For: 1.8.0






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (ACCUMULO-4128) ShellServerIT#scansWithClassLoaderContext fails consistently

2016-07-19 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion resolved ACCUMULO-4128.
---
Resolution: Fixed

> ShellServerIT#scansWithClassLoaderContext fails consistently
> 
>
> Key: ACCUMULO-4128
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4128
> Project: Accumulo
>  Issue Type: Bug
>  Components: test
>Reporter: Eric Newton
>Assignee: Dave Marion
> Fix For: 2.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (ACCUMULO-4128) ShellServerIT#scansWithClassLoaderContext fails consistently

2016-07-19 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion reassigned ACCUMULO-4128:
-

Assignee: Dave Marion

> ShellServerIT#scansWithClassLoaderContext fails consistently
> 
>
> Key: ACCUMULO-4128
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4128
> Project: Accumulo
>  Issue Type: Bug
>  Components: test
>Reporter: Eric Newton
>Assignee: Dave Marion
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4128) ShellServerIT#scansWithClassLoaderContext fails consistently

2016-07-19 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15384089#comment-15384089
 ] 

Dave Marion commented on ACCUMULO-4128:
---

cc: [~ctubbsii]

Ok, I think I found the problem. The shell command `setscaniter` appears to be 
non-existent in the master branch. The test was not checking the output of the 
shell to confirm that the iterator was applied. More concerning is that 
SetScanIterCommand is not marked deprecated in the 1.8 branch. The new command 
looks to be `setiter -scan`.
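
For reference, the replacement form looks like this in the shell (the iterator 
class and options here are just an example):

{noformat}
root@instance> setiter -scan -n ageoff -p 10 -class org.apache.accumulo.core.iterators.user.AgeOffFilter
{noformat}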

> ShellServerIT#scansWithClassLoaderContext fails consistently
> 
>
> Key: ACCUMULO-4128
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4128
> Project: Accumulo
>  Issue Type: Bug
>  Components: test
>Reporter: Eric Newton
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4128) ShellServerIT#scansWithClassLoaderContext fails consistently

2016-07-18 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15383303#comment-15383303
 ] 

Dave Marion commented on ACCUMULO-4128:
---

Thanks, I'll give it a shot in the morning.

> ShellServerIT#scansWithClassLoaderContext fails consistently
> 
>
> Key: ACCUMULO-4128
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4128
> Project: Accumulo
>  Issue Type: Bug
>  Components: test
>Reporter: Eric Newton
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (ACCUMULO-4128) ShellServerIT#scansWithClassLoaderContext fails consistently

2016-07-18 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15382544#comment-15382544
 ] 

Dave Marion edited comment on ACCUMULO-4128 at 7/18/16 4:37 PM:


I think I know what's going on, but I don't know where to look to fix it. I put 
in a bunch of debug and found that the classloaders that are loading the 
iterators for the test are the same (same instance). Furthermore, they are not 
VFS classloader or URL classloaders, they are of type 
sun.misc.Launcher$AppClassLoader, which (according to [1]) is the class loader 
for the application (in this case the TabletServer). Is there a way to print 
out the classpath of the java process that is started for MAC?

[1] https://blogs.oracle.com/sundararajan/entry/understanding_java_class_loading



was (Author: dlmarion):
I think I know what's going on, but I don't know where to look to fix it. I put 
in a bunch of debug and found that the classloaders that are loading the 
iterators for the test are the same (same instance). Furthermore, they are not 
VFS classloader or URL classloaders, then are of type 
sun.misc.Launcher$AppClassLoader, which (according to [1]) is the class loader 
for the application (in this case the TabletServer). Is there a way to print 
out the classpath of the java process that is started for MAC?

[1] https://blogs.oracle.com/sundararajan/entry/understanding_java_class_loading


> ShellServerIT#scansWithClassLoaderContext fails consistently
> 
>
> Key: ACCUMULO-4128
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4128
> Project: Accumulo
>  Issue Type: Bug
>  Components: test
>Reporter: Eric Newton
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4128) ShellServerIT#scansWithClassLoaderContext fails consistently

2016-07-18 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15382544#comment-15382544
 ] 

Dave Marion commented on ACCUMULO-4128:
---

I think I know what's going on, but I don't know where to look to fix it. I put 
in a bunch of debug and found that the classloaders that are loading the 
iterators for the test are the same (same instance). Furthermore, they are not 
VFS classloader or URL classloaders, then are of type 
sun.misc.Launcher$AppClassLoader, which (according to [1]) is the class loader 
for the application (in this case the TabletServer). Is there a way to print 
out the classpath of the java process that is started for MAC?

[1] https://blogs.oracle.com/sundararajan/entry/understanding_java_class_loading
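
(One low-tech way to answer that question is to dump the classpath from inside 
the process itself, e.g. temporarily in an iterator's init():)

{code:java}
// Prints the JVM's effective classpath; harmless to drop in temporarily.
System.out.println("MAC classpath: " + System.getProperty("java.class.path"));
{code}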


> ShellServerIT#scansWithClassLoaderContext fails consistently
> 
>
> Key: ACCUMULO-4128
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4128
> Project: Accumulo
>  Issue Type: Bug
>  Components: test
>Reporter: Eric Newton
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4128) ShellServerIT#scansWithClassLoaderContext fails consistently

2016-07-18 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15382136#comment-15382136
 ] 

Dave Marion commented on ACCUMULO-4128:
---

Which branch? Master, or is this failing in 1.8 also?

> ShellServerIT#scansWithClassLoaderContext fails consistently
> 
>
> Key: ACCUMULO-4128
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4128
> Project: Accumulo
>  Issue Type: Bug
>  Components: test
>Reporter: Eric Newton
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4375) Add missing Key constructors taking array of bytes as argument

2016-07-17 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15381337#comment-15381337
 ] 

Dave Marion commented on ACCUMULO-4375:
---

Suggest opening a separate issue.

> Add missing Key constructors taking array of bytes as argument
> --
>
> Key: ACCUMULO-4375
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4375
> Project: Accumulo
>  Issue Type: Improvement
>  Components: core
>Reporter: Mario Pastorelli
>  Labels: newbie
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> In my company we use {{Key}}s built directly from {{byte[]}} instead of 
> {{Text}}. Currently {{Key}} has many constructors working with {{Text}} and 
> only a few working with {{byte[]}}. You can still create a {{Key}} from a 
> {{byte[]}} by wrapping it in a {{Text}} and using a {{Text}}-based 
> constructor, but that boxes the {{byte[]}} into a {{Text}} for no good reason.
> I propose adding the missing {{byte[]}}-based {{Key}} constructors:
> {code:java}
> Key(byte[] row)
> Key(byte[] row, long ts)
> Key(byte[] row, byte[] cf)
> Key(byte[] row, byte[] cf, byte[] cq)
> Key(byte[] row, byte[] cf, byte[] cq, byte[] cv)
> Key(byte[] row, byte[] cf, byte[] cq, long ts)
> Key(byte[] row, byte[] cf, byte[] cq, ColumnVisibility cv, long ts)
> {code}
> The new constructors should behave like their {{Text}}-based counterparts, for 
> instance:
> {code:java}
> byte[] row = new byte[] {0};
> assertEquals(new Key(row), new Key(new Text(row)));
> {code}
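
One plausible shape for such a constructor is a simple delegation to an existing {{Text}}-based one; this is a fragment sketch under that assumption, not the committed implementation:

{code:java}
// Sketch: delegate the byte[]-based constructor to the Text-based one,
// which preserves the equality shown in the assertEquals example above.
public Key(byte[] row) {
  this(new Text(row));
}
{code}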



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4375) Add missing Key constructors taking array of bytes as argument

2016-07-16 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15380973#comment-15380973
 ] 

Dave Marion commented on ACCUMULO-4375:
---

What version are you using? I see some Key constructors in the 1.7 branch that 
take byte[].

> Add missing Key constructors taking array of bytes as argument
> --
>
> Key: ACCUMULO-4375
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4375
> Project: Accumulo
>  Issue Type: Improvement
>  Components: core
>Reporter: Mario Pastorelli
>  Labels: newbie
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In my company we use {{Key}}s built directly from {{byte[]}} instead of 
> {{Text}}. Currently {{Key}} has many constructors working with {{Text}} and 
> only a few working with {{byte[]}}. You can still create a {{Key}} from a 
> {{byte[]}} by wrapping it in a {{Text}} and using a {{Text}}-based 
> constructor, but that boxes the {{byte[]}} into a {{Text}} for no good reason.
> I propose adding the missing {{byte[]}}-based {{Key}} constructors:
> {code:java}
> Key(byte[] row)
> Key(byte[] row, long ts)
> Key(byte[] row, byte[] cf)
> Key(byte[] row, byte[] cf, byte[] cq)
> Key(byte[] row, byte[] cf, byte[] cq, byte[] cv)
> Key(byte[] row, byte[] cf, byte[] cq, long ts)
> Key(byte[] row, byte[] cf, byte[] cq, ColumnVisibility cv, long ts)
> {code}
> The new constructors should behave like their {{Text}}-based counterparts, for 
> instance:
> {code:java}
> byte[] row = new byte[] {0};
> assertEquals(new Key(row), new Key(new Text(row)));
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4128) ShellServerIT#scansWithClassLoaderContext fails consistently

2016-07-10 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15370111#comment-15370111
 ] 

Dave Marion commented on ACCUMULO-4128:
---

I will try to look at it this week.

> ShellServerIT#scansWithClassLoaderContext fails consistently
> 
>
> Key: ACCUMULO-4128
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4128
> Project: Accumulo
>  Issue Type: Bug
>  Components: test
>Reporter: Eric Newton
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4360) Make cancelling user initiated compactions easier

2016-07-08 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367701#comment-15367701
 ] 

Dave Marion commented on ACCUMULO-4360:
---

No, thanks for the information.

> Make cancelling user initiated compactions easier
> -
>
> Key: ACCUMULO-4360
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4360
> Project: Accumulo
>  Issue Type: Improvement
>  Components: fate, master, tserver
>Affects Versions: 1.7.2
>Reporter: Dave Marion
>
> To stop a user initiated compaction, you need to:
> 1. stop the master
> 2. remove the fate transaction
> 3. restart the tserver where the compaction is queued
> 4. restart the master
> If the compaction was not queued on the tablet server, but queued on the 
> master, cancelling the compaction would cause less churn on a cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (ACCUMULO-4360) Make cancelling user initiated compactions easier

2016-07-08 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion resolved ACCUMULO-4360.
---
Resolution: Not A Problem

> Make cancelling user initiated compactions easier
> -
>
> Key: ACCUMULO-4360
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4360
> Project: Accumulo
>  Issue Type: Improvement
>  Components: fate, master, tserver
>Affects Versions: 1.7.2
>Reporter: Dave Marion
>
> To stop a user initiated compaction, you need to:
> 1. stop the master
> 2. remove the fate transaction
> 3. restart the tserver where the compaction is queued
> 4. restart the master
> If the compaction was not queued on the tablet server, but queued on the 
> master, cancelling the compaction would cause less churn on a cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (ACCUMULO-4360) Make cancelling user initiated compactions easier

2016-07-08 Thread Dave Marion (JIRA)
Dave Marion created ACCUMULO-4360:
-

 Summary: Make cancelling user initiated compactions easier
 Key: ACCUMULO-4360
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4360
 Project: Accumulo
  Issue Type: Improvement
  Components: fate, master, tserver
Affects Versions: 1.7.2
Reporter: Dave Marion


To stop a user initiated compaction, you need to:

1. stop the master
2. remove the fate transaction
3. restart the tserver where the compaction is queued
4. restart the master

If the compaction was not queued on the tablet server, but queued on the 
master, cancelling the compaction would cause less churn on a cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4342) Admin#stopTabletServer(ClientContext, List, boolean) doesn't work with dynamic ports (0)

2016-06-23 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15347493#comment-15347493
 ] 

Dave Marion commented on ACCUMULO-4342:
---

If I remember correctly, this has been around for a while and is not associated 
with my recent changes.

> Admin#stopTabletServer(ClientContext, List, boolean) doesn't work 
> with dynamic ports (0)
> 
>
> Key: ACCUMULO-4342
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4342
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Josh Elser
> Fix For: 1.6.6, 1.7.3, 1.8.1
>
>
> Noticed in Dave's changeset from ACCUMULO-4331 that the logic to stop the 
> tabletservers when invoking `admin stop`won't work when the ports are set to 
> '0' (bind a free port in the ephemeral range).
> Looks like we'd have to do a few things to make this work properly:
> 1. If the tserver client port is '0' and no port is provided in the hostname 
> to `admin stop`, we should look at ZK to stop all tservers on that host.
> 2. If the tserver client port is '0' and a port is provided in the hostname 
> to `admin stop`, we should try to just stop the tserver with the given port 
> on that host.
> Would have to look more closely at the code to verify this all makes sense.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4353) Stabilize tablet assignment during transient failure

2016-06-23 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15347356#comment-15347356
 ] 

Dave Marion commented on ACCUMULO-4353:
---

bq. there's always a concern of technical debt (in terms of complexity)

Fair enough.

bq. If this is really about trying to make rolling-restarts better.

Not sure; I think it has to do with quick unplanned restarts, but it would be 
good to clear that up. I see a rolling restart as an intentional, planned 
activity. I think (based on "failure" in the title) this is for the 
unintentional, unplanned, short-duration outage (e.g. losing connectivity to a 
rack for a short time) where the administrator wants to bring the failed tablet 
servers back up as soon as possible. 

> Stabilize tablet assignment during transient failure
> 
>
> Key: ACCUMULO-4353
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4353
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Shawn Walker
>Assignee: Shawn Walker
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When a tablet server dies, Accumulo attempts to reassign the tablets it was 
> hosting as quickly as possible to maintain availability.  If multiple tablet 
> servers die in quick succession, such as from a rolling restart of the 
> Accumulo cluster or a network partition, this behavior can cause a storm of 
> reassignment and rebalancing, placing significant load on the master.
> To avert such load, Accumulo should be capable of maintaining a steady tablet 
> assignment state in the face of transient tablet server loss.  Instead of 
> reassigning tablets as quickly as possible, Accumulo should await the 
> return of a temporarily downed tablet server (for some configurable duration) 
> before assigning its tablets to other tablet servers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4353) Stabilize tablet assignment during transient failure

2016-06-23 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15347329#comment-15347329
 ] 

Dave Marion commented on ACCUMULO-4353:
---

bq. Can you expand on this some more? Given that assignment is arguably the 
most important thing for the Master to do, why are we concerned about letting 
the master do that as fast as it can (for the aforementioned reason)? Do we 
need to come up with a more efficient way for the master to handle the 
reassignment of many tablets?

Reading through this, and bringing some first-hand experience, I don't think 
the issue is the Master assigning tablets. It's the tablet servers that are 
down for a short period of time. When a tserver goes down, the Master 
re-assigns its tablets. When the tserver comes back up, it goes through several 
rounds of balancing, which can take a long time and cause a lot of churn.

bq. I'm a little worried about this as a configuration knob – I feel like it 
kind of goes against the highly-available distributed database which we expect 
Accumulo to be. When we don't reassign tablets fast, that is a direct lack of 
availability for clients to read data.

I don't see any harm done here as long as the default behavior is what happens 
today. Allowing an administrator to choose to delay tablet reassignment may not 
fit most use cases, but it could fit some.

My 2 cents.
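
To make the trade-off concrete, a plain-Java sketch of the kind of grace period being discussed might look like the following (the class structure and names here are invented for illustration, not Accumulo code):

{code:java}
// Hypothetical sketch: defer reassignment of a dead tserver's tablets until a
// configurable grace period elapses; cancel the deferral if it comes back.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ReassignmentGate {
  private final long graceMs; // value would come from a new (invented) property
  private final Map<String, Long> deadSince = new ConcurrentHashMap<>();

  public ReassignmentGate(long graceMs) {
    this.graceMs = graceMs;
  }

  public void markDead(String tserver) {
    deadSince.putIfAbsent(tserver, System.currentTimeMillis());
  }

  public void markAlive(String tserver) {
    deadSince.remove(tserver); // came back in time; keep assignments stable
  }

  public boolean shouldReassign(String tserver) {
    Long since = deadSince.get(tserver);
    return since != null && System.currentTimeMillis() - since >= graceMs;
  }
}
{code}

With the grace period set to zero this degrades to today's behavior, which matches the point above about keeping the default unchanged.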

> Stabilize tablet assignment during transient failure
> 
>
> Key: ACCUMULO-4353
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4353
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Shawn Walker
>Assignee: Shawn Walker
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When a tablet server dies, Accumulo attempts to reassign the tablets it was 
> hosting as quickly as possible to maintain availability.  If multiple tablet 
> servers die in quick succession, such as from a rolling restart of the 
> Accumulo cluster or a network partition, this behavior can cause a storm of 
> reassignment and rebalancing, placing significant load on the master.
> To avert such load, Accumulo should be capable of maintaining a steady tablet 
> assignment state in the face of transient tablet server loss.  Instead of 
> reassigning tablets as quickly as possible, Accumulo should await the 
> return of a temporarily downed tablet server (for some configurable duration) 
> before assigning its tablets to other tablet servers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4350) ShellServerIT fails to load RowSampler class

2016-06-21 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15341723#comment-15341723
 ] 

Dave Marion commented on ACCUMULO-4350:
---

Results are below. I saw the same error you did in the logs, but I don't think 
it's related. I saw a ton of "stuck on IO" warnings at two minutes, and these 
tests fail after 60 seconds:

{noformat}
Tests in error: 
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.createTableWithProperties:836 » TestTimedOut test timed out 
afte...
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleterows:1142 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.egrep:420 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.exporttableImporttable:358 » TestTimedOut test timed out after 
6...
  ShellServerIT.formatter:1198 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.iter:535 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.maxrow:1417 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.merge:1429 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.namespaces:1657 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.testCompactionSelection:990 » TestTimedOut test timed out after 
...
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.testCompactions:873 » TestTimedOut test timed out after 60 
secon...
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.testScanScample:1033 » TestTimedOut test timed out after 60 
seco...
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.user:484 » TestTimedOut test timed out after 60 seconds
  ShellServerIT.deleteTables:291 » TestTimedOut test timed out after 60 seconds

Tests run: 49, Failures: 0, Errors: 47, Skipped: 0
{noformat}

I think the ShellServerIT.testScanScample test is failing because the 
accumulo-core.jar file is not copied into the 
target/mini-tests/org.apache.accumulo.harness.SharedMiniClusterBase_*/lib/ext 
directory.
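
For illustration, the missing setup step would amount to something like the following sketch (both paths are placeholders, not the actual MAC layout):

{code:java}
// Hypothetical sketch: copy accumulo-core.jar into the mini cluster's lib/ext
// directory before the test runs, so the VFS context can load RowSampler.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class CopyCoreJar {
  public static void main(String[] args) throws IOException {
    Path coreJar = Paths.get("lib/accumulo-core.jar");             // placeholder
    Path libExt = Paths.get("target/mini-tests/cluster/lib/ext");  // placeholder
    Files.createDirectories(libExt);
    Files.copy(coreJar, libExt.resolve(coreJar.getFileName()),
        StandardCopyOption.REPLACE_EXISTING);
  }
}
{code}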

> ShellServerIT fails to load RowSampler class
> 
>
> Key: ACCUMULO-4350
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4350
> Project: Accumulo
>  Issue Type: Bug
>  Components: test
>Reporter: Josh Elser
>Priority: Critical
> Fix For: 1.9.0
>
>
> Running {{JAVA_HOME=$(/usr/libexec/java_home -v1.8) mvn clean verify 
> -Dit.test=ShellServerIT -Dtest=foobar -DfailIfNoTests=false}}:
> 

[jira] [Commented] (ACCUMULO-4350) ShellServerIT fails to load RowSampler class

2016-06-21 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15341674#comment-15341674
 ] 

Dave Marion commented on ACCUMULO-4350:
---

I'm running it right now with Java 7 and seeing tons of the same "stuck on IO" 
warnings.

> ShellServerIT fails to load RowSampler class
> 
>
> Key: ACCUMULO-4350
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4350
> Project: Accumulo
>  Issue Type: Bug
>  Components: test
>Reporter: Josh Elser
>Priority: Critical
> Fix For: 1.9.0
>
>
> Running {{JAVA_HOME=$(/usr/libexec/java_home -v1.8) mvn clean verify 
> -Dit.test=ShellServerIT -Dtest=foobar -DfailIfNoTests=false}}:
> I can see the following which is causing ShellServerIT to never exit:
> {noformat}
> 2016-06-21 00:48:23,675 [impl.ThriftTransportPool] WARN : Thread 
> "Time-limited test" stuck on IO to host.domain:61175 (0) for at least 120324 
> ms
> {noformat}
> Looking in the TabletServer logs for MAC
> {noformat}
> 2016-06-21 00:48:50,019 [tablet.Compactor] ERROR: 
> java.lang.ClassNotFoundException: org.apache.accumulo.core.sample.RowSampler
> java.lang.RuntimeException: java.lang.ClassNotFoundException: 
> org.apache.accumulo.core.sample.RowSampler
>   at 
> org.apache.accumulo.core.sample.impl.SamplerFactory.newSampler(SamplerFactory.java:45)
>   at 
> org.apache.accumulo.core.file.rfile.RFileOperations.openWriter(RFileOperations.java:91)
>   at 
> org.apache.accumulo.core.file.DispatchingFileFactory.openWriter(DispatchingFileFactory.java:74)
>   at 
> org.apache.accumulo.core.file.FileOperations$OpenWriterOperation.build(FileOperations.java:331)
>   at org.apache.accumulo.tserver.tablet.Compactor.call(Compactor.java:201)
>   at 
> org.apache.accumulo.tserver.tablet.Tablet._majorCompact(Tablet.java:1850)
>   at 
> org.apache.accumulo.tserver.tablet.Tablet.majorCompact(Tablet.java:1967)
>   at 
> org.apache.accumulo.tserver.tablet.CompactionRunner.run(CompactionRunner.java:44)
>   at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.accumulo.core.sample.RowSampler
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.findClass(VFSClassLoader.java:178)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.accumulo.start.classloader.vfs.AccumuloVFSClassLoader.loadClass(AccumuloVFSClassLoader.java:110)
>   at 
> org.apache.accumulo.core.sample.impl.SamplerFactory.newSampler(SamplerFactory.java:36)
>   ... 12 more
> 2016-06-21 00:48:50,019 [tablet.Tablet] ERROR: MajC Unexpected exception, 
> extent = h<<
> java.lang.RuntimeException: java.lang.ClassNotFoundException: 
> org.apache.accumulo.core.sample.RowSampler
>   at 
> org.apache.accumulo.core.sample.impl.SamplerFactory.newSampler(SamplerFactory.java:45)
>   at 
> org.apache.accumulo.core.file.rfile.RFileOperations.openWriter(RFileOperations.java:91)
>   at 
> org.apache.accumulo.core.file.DispatchingFileFactory.openWriter(DispatchingFileFactory.java:74)
>   at 
> org.apache.accumulo.core.file.FileOperations$OpenWriterOperation.build(FileOperations.java:331)
>   at org.apache.accumulo.tserver.tablet.Compactor.call(Compactor.java:201)
>   at 
> org.apache.accumulo.tserver.tablet.Tablet._majorCompact(Tablet.java:1850)
>   at 
> org.apache.accumulo.tserver.tablet.Tablet.majorCompact(Tablet.java:1967)
>   at 
> org.apache.accumulo.tserver.tablet.CompactionRunner.run(CompactionRunner.java:44)
>   at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.accumulo.core.sample.RowSampler
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.findClass(VFSClassLoader.java:178)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.accumulo.start.classloader.vfs.AccumuloVFSClassLoader.loadClass(AccumuloVFSClassLoader.java:110)
>   at 
> 

[jira] [Commented] (ACCUMULO-4350) ShellServerIT fails to load RowSampler class

2016-06-21 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15341629#comment-15341629
 ] 

Dave Marion commented on ACCUMULO-4350:
---

Will do.

> ShellServerIT fails to load RowSampler class
> 
>
> Key: ACCUMULO-4350
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4350
> Project: Accumulo
>  Issue Type: Bug
>  Components: test
>Reporter: Josh Elser
>Priority: Critical
> Fix For: 1.9.0
>
>
> Running {{JAVA_HOME=$(/usr/libexec/java_home -v1.8) mvn clean verify 
> -Dit.test=ShellServerIT -Dtest=foobar -DfailIfNoTests=false}}:
> I can see the following which is causing ShellServerIT to never exit:
> {noformat}
> 2016-06-21 00:48:23,675 [impl.ThriftTransportPool] WARN : Thread 
> "Time-limited test" stuck on IO to host.domain:61175 (0) for at least 120324 
> ms
> {noformat}
> Looking in the TabletServer logs for MAC
> {noformat}
> 2016-06-21 00:48:50,019 [tablet.Compactor] ERROR: 
> java.lang.ClassNotFoundException: org.apache.accumulo.core.sample.RowSampler
> java.lang.RuntimeException: java.lang.ClassNotFoundException: 
> org.apache.accumulo.core.sample.RowSampler
>   at 
> org.apache.accumulo.core.sample.impl.SamplerFactory.newSampler(SamplerFactory.java:45)
>   at 
> org.apache.accumulo.core.file.rfile.RFileOperations.openWriter(RFileOperations.java:91)
>   at 
> org.apache.accumulo.core.file.DispatchingFileFactory.openWriter(DispatchingFileFactory.java:74)
>   at 
> org.apache.accumulo.core.file.FileOperations$OpenWriterOperation.build(FileOperations.java:331)
>   at org.apache.accumulo.tserver.tablet.Compactor.call(Compactor.java:201)
>   at 
> org.apache.accumulo.tserver.tablet.Tablet._majorCompact(Tablet.java:1850)
>   at 
> org.apache.accumulo.tserver.tablet.Tablet.majorCompact(Tablet.java:1967)
>   at 
> org.apache.accumulo.tserver.tablet.CompactionRunner.run(CompactionRunner.java:44)
>   at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.accumulo.core.sample.RowSampler
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.findClass(VFSClassLoader.java:178)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.accumulo.start.classloader.vfs.AccumuloVFSClassLoader.loadClass(AccumuloVFSClassLoader.java:110)
>   at 
> org.apache.accumulo.core.sample.impl.SamplerFactory.newSampler(SamplerFactory.java:36)
>   ... 12 more
> 2016-06-21 00:48:50,019 [tablet.Tablet] ERROR: MajC Unexpected exception, 
> extent = h<<
> java.lang.RuntimeException: java.lang.ClassNotFoundException: 
> org.apache.accumulo.core.sample.RowSampler
>   at 
> org.apache.accumulo.core.sample.impl.SamplerFactory.newSampler(SamplerFactory.java:45)
>   at 
> org.apache.accumulo.core.file.rfile.RFileOperations.openWriter(RFileOperations.java:91)
>   at 
> org.apache.accumulo.core.file.DispatchingFileFactory.openWriter(DispatchingFileFactory.java:74)
>   at 
> org.apache.accumulo.core.file.FileOperations$OpenWriterOperation.build(FileOperations.java:331)
>   at org.apache.accumulo.tserver.tablet.Compactor.call(Compactor.java:201)
>   at 
> org.apache.accumulo.tserver.tablet.Tablet._majorCompact(Tablet.java:1850)
>   at 
> org.apache.accumulo.tserver.tablet.Tablet.majorCompact(Tablet.java:1967)
>   at 
> org.apache.accumulo.tserver.tablet.CompactionRunner.run(CompactionRunner.java:44)
>   at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.accumulo.core.sample.RowSampler
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.findClass(VFSClassLoader.java:178)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.accumulo.start.classloader.vfs.AccumuloVFSClassLoader.loadClass(AccumuloVFSClassLoader.java:110)
>   at 
> org.apache.accumulo.core.sample.impl.SamplerFactory.newSampler(SamplerFactory.java:36)
>  

[jira] [Commented] (ACCUMULO-3923) bootstrap_hdfs.sh does not copy correct jars to hdfs

2016-06-20 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-3923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340873#comment-15340873
 ] 

Dave Marion commented on ACCUMULO-3923:
---

Yes, with ACCUMULO-4341

>  bootstrap_hdfs.sh does not copy correct jars to hdfs
> -
>
> Key: ACCUMULO-3923
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3923
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Josh Elser
>Assignee: Dave Marion
>Priority: Critical
> Fix For: 1.7.2, 1.8.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Trying to make the VFS classloading stuff work and it doesn't seem like 
> ServiceLoader is finding any of the KeywordExecutable implementations.
> Best I can tell after looking into this, VFSClassLoader (created by 
> AccumuloVFSClassLoader) has all of the jars listed as resources, but when 
> ServiceLoader tries to find the META-INF/services definitions, it returns 
> nothing, and thus we think the keyword must be a class name. Seems like a 
> commons-vfs bug.
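
For context, the lookup described above is roughly the following (a sketch; it assumes the {{KeywordExecutable}} SPI interface from the start module, whose package path may differ by version):

{code:java}
// Sketch: enumerate KeywordExecutable implementations via ServiceLoader.
// Per the description above, with a VFSClassLoader this iteration comes back
// empty because the META-INF/services entries are never found.
import java.util.ServiceLoader;

import org.apache.accumulo.start.spi.KeywordExecutable;

public class KeywordCheck {
  public static void main(String[] args) {
    ClassLoader loader = Thread.currentThread().getContextClassLoader();
    for (KeywordExecutable ke : ServiceLoader.load(KeywordExecutable.class, loader)) {
      System.out.println("found keyword: " + ke.keyword());
    }
  }
}
{code}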



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4329) Replication services don't have property to enable port-search

2016-06-20 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15339470#comment-15339470
 ] 

Dave Marion commented on ACCUMULO-4329:
---

It's an issue in that the feature is not supported, but it is not critical, as 
the workaround is to set replication.receipt.service.port to zero. Having said 
that, I don't know if the intent is to support the feature for this port. I 
would say close it for now; this ticket is documentation enough for anyone who 
runs into it.
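
For anyone who does run into it, the workaround amounts to this accumulo-site.xml entry (per the port properties discussed in ACCUMULO-4331, a value of zero requests a random free port):

{noformat}
<property>
  <name>replication.receipt.service.port</name>
  <value>0</value>
</property>
{noformat}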

> Replication services don't have property to enable port-search
> --
>
> Key: ACCUMULO-4329
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4329
> Project: Accumulo
>  Issue Type: Bug
>  Components: tserver
>Reporter: Dave Marion
> Fix For: 1.8.0
>
>
> TabletServer will not start if it cannot reserve the configured replication 
> port. The code passes `null` for enabling the port search feature, which 
> resolves to `false`, and the TServer fails.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (ACCUMULO-4344) Fix PortRange property validation

2016-06-14 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion resolved ACCUMULO-4344.
---
Resolution: Fixed

> Fix PortRange property validation
> -
>
> Key: ACCUMULO-4344
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4344
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Dave Marion
>Assignee: Dave Marion
> Fix For: 1.8.0
>
>
> PortRange.parse should not modify the ports, just ensure that they are valid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (ACCUMULO-4344) Fix PortRange property validation

2016-06-14 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion reassigned ACCUMULO-4344:
-

Assignee: Dave Marion

> Fix PortRange property validation
> -
>
> Key: ACCUMULO-4344
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4344
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Dave Marion
>Assignee: Dave Marion
> Fix For: 1.8.0
>
>
> PortRange.parse should not modify the ports, just ensure that they are valid.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (ACCUMULO-4344) Fix PortRange property validation

2016-06-14 Thread Dave Marion (JIRA)
Dave Marion created ACCUMULO-4344:
-

 Summary: Fix PortRange property validation
 Key: ACCUMULO-4344
 URL: https://issues.apache.org/jira/browse/ACCUMULO-4344
 Project: Accumulo
  Issue Type: Bug
Reporter: Dave Marion
 Fix For: 1.8.0


PortRange.parse should not modify the ports, just ensure that they are valid.
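
A sketch of that contract, assuming the M-N range syntax proposed in ACCUMULO-4331 (illustrative only, not the committed code):

{code:java}
// Sketch: parse "M-N", validate, and return the ports unchanged. The
// 1024-65535 bounds mirror the range mentioned in ACCUMULO-4331.
public static int[] parse(String portRange) {
  int idx = portRange.indexOf('-');
  if (idx < 0) {
    throw new IllegalArgumentException("Invalid port range: " + portRange);
  }
  int low = Integer.parseInt(portRange.substring(0, idx));
  int high = Integer.parseInt(portRange.substring(idx + 1));
  // Validate only; never clamp or otherwise modify the supplied ports.
  if (low < 1024 || high > 65535 || low > high) {
    throw new IllegalArgumentException("Invalid port range: " + portRange);
  }
  return new int[] {low, high};
}
{code}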



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (ACCUMULO-4331) Make port configuration and allocation consistent across services

2016-06-14 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion updated ACCUMULO-4331:
--
Attachment: ACCUMULO-4331-1.patch

> Make port configuration and allocation consistent across services
> -
>
> Key: ACCUMULO-4331
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4331
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Dave Marion
>Assignee: Dave Marion
> Fix For: 1.8.0
>
> Attachments: ACCUMULO-4331-1.patch
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> There was some discussion in ACCUMULO-4328 about ports, so I decided to track 
> down how the client ports are configured and allocated. Issues raised in the 
> discussion were:
>  1. The port search feature was not well understood
>  2. Ephemeral port allocation makes it hard to lock servers down (e.g. 
> iptables)
> Looking through the code, I found that the following properties allocate a port 
> number based on conf.getPort(). This returns the port number from the 
> property and supports either a single value or zero. Then, in the server 
> component (monitor, tracer, gc, etc.) this value is used when creating a 
> ServerSocket. If the port is already in use, the process will fail.
> {noformat}
> monitor.port.log4j
> trace.port.client
> gc.port.client
> monitor.port.client
> {noformat}
> The following properties use TServerUtils.startServer, which uses the value in 
> the property to start the TServer. If the value is zero, it picks a 
> random port between 1024 and 65535. If tserver.port.search is enabled, it 
> will try up to a thousand times to bind to a random port.
> {noformat}
> tserver.port.client
> master.port.client
> master.replication.coordinator.port
> replication.receipt.service.port
> {noformat}
> I'm proposing that we deprecate the tserver.port.search property and the 
> use of zero as a value for the properties above. Instead, I think 
> we should allow the user to specify a single value or a range (M-N). 
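
The startServer behavior described here, reduced to a runnable sketch (not the actual TServerUtils code):

{code:java}
// Sketch of the described behavior: with a configured port of zero, pick a
// random port in [1024, 65535]; with tserver.port.search enabled, retry up
// to a thousand times until a bind succeeds.
import java.io.IOException;
import java.net.ServerSocket;
import java.util.Random;

public class PortSearchSketch {
  public static ServerSocket bindRandomPort(boolean portSearch) throws IOException {
    Random random = new Random();
    int attempts = portSearch ? 1000 : 1;
    for (int i = 0; i < attempts; i++) {
      int port = 1024 + random.nextInt(65535 - 1024 + 1);
      try {
        return new ServerSocket(port);
      } catch (IOException e) {
        // port already in use; try another if search is enabled
      }
    }
    throw new IOException("unable to find a free port");
  }
}
{code}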



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (ACCUMULO-4331) Make port configuration and allocation consistent across services

2016-06-14 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion resolved ACCUMULO-4331.
---
Resolution: Fixed

> Make port configuration and allocation consistent across services
> -
>
> Key: ACCUMULO-4331
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4331
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Dave Marion
>Assignee: Dave Marion
> Fix For: 1.8.0
>
> Attachments: ACCUMULO-4331-1.patch
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> There was some discussion in ACCUMULO-4328 about ports, so I decided to track 
> down how the client ports are configured and allocated. Issues raised in the 
> discussion were:
>  1. The port search feature was not well understood
>  2. Ephemeral port allocation makes it hard to lock servers down (e.g. 
> iptables)
> Looking through the code, I found that the following properties allocate a port 
> number based on conf.getPort(). This returns the port number from the 
> property and supports either a single value or zero. Then, in the server 
> component (monitor, tracer, gc, etc.) this value is used when creating a 
> ServerSocket. If the port is already in use, the process will fail.
> {noformat}
> monitor.port.log4j
> trace.port.client
> gc.port.client
> monitor.port.client
> {noformat}
> The following properties use TServerUtils.startServer, which uses the value in 
> the property to start the TServer. If the value is zero, it picks a 
> random port between 1024 and 65535. If tserver.port.search is enabled, it 
> will try up to a thousand times to bind to a random port.
> {noformat}
> tserver.port.client
> master.port.client
> master.replication.coordinator.port
> replication.receipt.service.port
> {noformat}
> I'm proposing that we deprecate the tserver.port.search property and the 
> use of zero as a value for the properties above. Instead, I think 
> we should allow the user to specify a single value or a range (M-N). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (ACCUMULO-4341) ServiceLoader deadlock with classes loaded from HDFS

2016-06-14 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion resolved ACCUMULO-4341.
---
Resolution: Fixed

> ServiceLoader deadlock with classes loaded from HDFS
> 
>
> Key: ACCUMULO-4341
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4341
> Project: Accumulo
>  Issue Type: Bug
>  Components: client
>Affects Versions: 1.7.0, 1.8.0
>Reporter: Dave Marion
>Assignee: Dave Marion
>Priority: Blocker
> Fix For: 1.7.2, 1.8.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> With Accumulo set up to use general.vfs.classpaths to load classes from HDFS, 
> running `accumulo help` will hang. 
> A jstack of the process shows the IPC Client thread at:
> {noformat}
>java.lang.Thread.State: BLOCKED (on object monitor)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2051)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:91)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>   at 
> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1086)
>   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:966)
> {noformat}
> and the main thread at:
> {noformat}
>java.lang.Thread.State: WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:502)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1454)
>   - locked <0xf09a2898> (a org.apache.hadoop.ipc.Client$Call)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
>   at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>   at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1982)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1128)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1124)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1124)
>   at 
> org.apache.commons.vfs2.provider.hdfs.HdfsFileObject.doAttach(HdfsFileObject.java:85)
>   at 
> org.apache.commons.vfs2.provider.AbstractFileObject.attach(AbstractFileObject.java:173)
>   - locked <0xf57fd008> (a 
> org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem)
>   at 
> org.apache.commons.vfs2.provider.AbstractFileObject.getContent(AbstractFileObject.java:1236)
>   - locked <0xf57fd008> (a 
> org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem)
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.getPermissions(VFSClassLoader.java:300)
>   at 
> java.security.SecureClassLoader.getProtectionDomain(SecureClassLoader.java:206)
>   - locked <0xf5ad9138> (a java.util.HashMap)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.defineClass(VFSClassLoader.java:226)
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.findClass(VFSClassLoader.java:180)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   - locked <0xf5af3b88> (a 
> org.apache.commons.vfs2.impl.VFSClassLoader)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>   - locked <0xf6f5c2f8> (a 
> org.apache.commons.vfs2.impl.VFSClassLoader)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:370)
>   at 

[jira] [Commented] (ACCUMULO-4341) ServiceLoader deadlock with classes loaded from HDFS

2016-06-14 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15329491#comment-15329491
 ] 

Dave Marion commented on ACCUMULO-4341:
---

Backported ACCUMULO-3923 to 1.7.2-SNAPSHOT and applied this fix to the same 
branch. Tested using the following process:

{noformat}
1. tar zxf accumulo-1.7.2-SNAPSHOT-bin.tar.gz
2. cd accumulo-1.7.2-SNAPSHOT/bin
3. ./build_native_library.sh
4. ./bootstrap_config.sh
5. Update accumulo-env.sh
6. In accumulo-site.xml, set instance.volumes and add:

  <property>
    <name>general.vfs.classpaths</name>
    <value>hdfs://<namenode>:<port>/accumulo-1.7.2-SNAPSHOT-system-classpath/.*.jar</value>
  </property>

7. ./bootstrap_hdfs.sh
8. accumulo init
9. start-all.sh
{noformat}


> ServiceLoader deadlock with classes loaded from HDFS
> 
>
> Key: ACCUMULO-4341
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4341
> Project: Accumulo
>  Issue Type: Bug
>  Components: client
>Affects Versions: 1.7.0, 1.8.0
>Reporter: Dave Marion
>Priority: Blocker
> Fix For: 1.7.2, 1.8.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> With Accumulo set up to use general.vfs.classpaths to load classes from HDFS, 
> running `accumulo help` will hang. 
> A jstack of the process shows the IPC Client thread at:
> {noformat}
>java.lang.Thread.State: BLOCKED (on object monitor)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2051)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:91)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>   at 
> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1086)
>   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:966)
> {noformat}
> and the main thread at:
> {noformat}
>java.lang.Thread.State: WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:502)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1454)
>   - locked <0xf09a2898> (a org.apache.hadoop.ipc.Client$Call)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
>   at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>   at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1982)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1128)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1124)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1124)
>   at 
> org.apache.commons.vfs2.provider.hdfs.HdfsFileObject.doAttach(HdfsFileObject.java:85)
>   at 
> org.apache.commons.vfs2.provider.AbstractFileObject.attach(AbstractFileObject.java:173)
>   - locked <0xf57fd008> (a 
> org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem)
>   at 
> org.apache.commons.vfs2.provider.AbstractFileObject.getContent(AbstractFileObject.java:1236)
>   - locked <0xf57fd008> (a 
> org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem)
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.getPermissions(VFSClassLoader.java:300)
>   at 
> java.security.SecureClassLoader.getProtectionDomain(SecureClassLoader.java:206)
>   - locked <0xf5ad9138> (a java.util.HashMap)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.defineClass(VFSClassLoader.java:226)
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.findClass(VFSClassLoader.java:180)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   - locked <0xf5af3b88> (a 
> 

[jira] [Updated] (ACCUMULO-4341) ServiceLoader deadlock with classes loaded from HDFS

2016-06-14 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion updated ACCUMULO-4341:
--
Affects Version/s: 1.7.0

> ServiceLoader deadlock with classes loaded from HDFS
> 
>
> Key: ACCUMULO-4341
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4341
> Project: Accumulo
>  Issue Type: Bug
>  Components: client
>Affects Versions: 1.7.0, 1.8.0
>Reporter: Dave Marion
>Assignee: Dave Marion
>Priority: Blocker
> Fix For: 1.7.2, 1.8.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> With Accumulo set up to use general.vfs.classpaths to load classes from HDFS, 
> running `accumulo help` will hang. 
> A jstack of the process shows the IPC Client thread at:
> {noformat}
>java.lang.Thread.State: BLOCKED (on object monitor)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2051)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:91)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>   at 
> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1086)
>   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:966)
> {noformat}
> and the main thread at:
> {noformat}
>java.lang.Thread.State: WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:502)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1454)
>   - locked <0xf09a2898> (a org.apache.hadoop.ipc.Client$Call)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
>   at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>   at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1982)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1128)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1124)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1124)
>   at 
> org.apache.commons.vfs2.provider.hdfs.HdfsFileObject.doAttach(HdfsFileObject.java:85)
>   at 
> org.apache.commons.vfs2.provider.AbstractFileObject.attach(AbstractFileObject.java:173)
>   - locked <0xf57fd008> (a 
> org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem)
>   at 
> org.apache.commons.vfs2.provider.AbstractFileObject.getContent(AbstractFileObject.java:1236)
>   - locked <0xf57fd008> (a 
> org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem)
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.getPermissions(VFSClassLoader.java:300)
>   at 
> java.security.SecureClassLoader.getProtectionDomain(SecureClassLoader.java:206)
>   - locked <0xf5ad9138> (a java.util.HashMap)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.defineClass(VFSClassLoader.java:226)
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.findClass(VFSClassLoader.java:180)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   - locked <0xf5af3b88> (a 
> org.apache.commons.vfs2.impl.VFSClassLoader)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>   - locked <0xf6f5c2f8> (a 
> org.apache.commons.vfs2.impl.VFSClassLoader)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:370)
>   at 

[jira] [Assigned] (ACCUMULO-4341) ServiceLoader deadlock with classes loaded from HDFS

2016-06-14 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion reassigned ACCUMULO-4341:
-

Assignee: Dave Marion

> ServiceLoader deadlock with classes loaded from HDFS
> 
>
> Key: ACCUMULO-4341
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4341
> Project: Accumulo
>  Issue Type: Bug
>  Components: client
>Affects Versions: 1.7.0, 1.8.0
>Reporter: Dave Marion
>Assignee: Dave Marion
>Priority: Blocker
> Fix For: 1.7.2, 1.8.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> With Accumulo set up to use general.vfs.classpaths to load classes from HDFS, 
> running `accumulo help` will hang. 
> A jstack of the process shows the IPC Client thread at:
> {noformat}
>java.lang.Thread.State: BLOCKED (on object monitor)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2051)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:91)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>   at 
> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1086)
>   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:966)
> {noformat}
> and the main thread at:
> {noformat}
>java.lang.Thread.State: WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:502)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1454)
>   - locked <0xf09a2898> (a org.apache.hadoop.ipc.Client$Call)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
>   at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>   at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1982)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1128)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1124)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1124)
>   at 
> org.apache.commons.vfs2.provider.hdfs.HdfsFileObject.doAttach(HdfsFileObject.java:85)
>   at 
> org.apache.commons.vfs2.provider.AbstractFileObject.attach(AbstractFileObject.java:173)
>   - locked <0xf57fd008> (a 
> org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem)
>   at 
> org.apache.commons.vfs2.provider.AbstractFileObject.getContent(AbstractFileObject.java:1236)
>   - locked <0xf57fd008> (a 
> org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem)
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.getPermissions(VFSClassLoader.java:300)
>   at 
> java.security.SecureClassLoader.getProtectionDomain(SecureClassLoader.java:206)
>   - locked <0xf5ad9138> (a java.util.HashMap)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.defineClass(VFSClassLoader.java:226)
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.findClass(VFSClassLoader.java:180)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   - locked <0xf5af3b88> (a 
> org.apache.commons.vfs2.impl.VFSClassLoader)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>   - locked <0xf6f5c2f8> (a 
> org.apache.commons.vfs2.impl.VFSClassLoader)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:370)
>   at 

[jira] [Updated] (ACCUMULO-3923) bootstrap_hdfs.sh does not copy correct jars to hdfs

2016-06-14 Thread Dave Marion (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-3923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Marion updated ACCUMULO-3923:
--
Fix Version/s: 1.7.2

>  bootstrap_hdfs.sh does not copy correct jars to hdfs
> -
>
> Key: ACCUMULO-3923
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3923
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Josh Elser
>Assignee: Dave Marion
>Priority: Critical
> Fix For: 1.7.2, 1.8.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Trying to make the VFS classloading stuff work and it doesn't seem like 
> ServiceLoader is finding any of the KeywordExecutable implementations.
> Best I can tell after looking into this, VFSClassLoader (created by 
> AccumuloVFSClassLoader) has all of the jars listed as resources, but when 
> ServiceLoader tries to find the META-INF/services definitions, it returns 
> nothing, and thus we think the keyword must be a class name. Seems like a 
> commons-vfs bug.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-3923) bootstrap_hdfs.sh does not copy correct jars to hdfs

2016-06-13 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-3923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328621#comment-15328621
 ] 

Dave Marion commented on ACCUMULO-3923:
---

I will cherry-pick this back to 1.7.2 and test it with ACCUMULO-4341 tomorrow.

>  bootstrap_hdfs.sh does not copy correct jars to hdfs
> -
>
> Key: ACCUMULO-3923
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3923
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Josh Elser
>Assignee: Dave Marion
>Priority: Critical
> Fix For: 1.8.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I'm trying to make the VFS classloading stuff work, and it doesn't seem like 
> ServiceLoader is finding any of the KeywordExecutable implementations.
> Best I can tell after looking into this, VFSClassLoader (created by 
> AccumuloVFSClassLoader) has all of the jars listed as resources, but when 
> ServiceLoader tries to find the META-INF/services definitions, it returns 
> nothing, and thus we think the keyword must be a class name. Seems like a 
> commons-vfs bug.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-3923) bootstrap_hdfs.sh does not copy correct jars to hdfs

2016-06-13 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-3923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328348#comment-15328348
 ] 

Dave Marion commented on ACCUMULO-3923:
---

When did we move to slf4j?

>  bootstrap_hdfs.sh does not copy correct jars to hdfs
> -
>
> Key: ACCUMULO-3923
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3923
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Josh Elser
>Assignee: Dave Marion
>Priority: Critical
> Fix For: 1.8.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I'm trying to make the VFS classloading stuff work, and it doesn't seem like 
> ServiceLoader is finding any of the KeywordExecutable implementations.
> Best I can tell after looking into this, VFSClassLoader (created by 
> AccumuloVFSClassLoader) has all of the jars listed as resources, but when 
> ServiceLoader tries to find the META-INF/services definitions, it returns 
> nothing, and thus we think the keyword must be a class name. Seems like a 
> commons-vfs bug.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4341) ServiceLoader deadlock with classes loaded from HDFS

2016-06-13 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15328335#comment-15328335
 ] 

Dave Marion commented on ACCUMULO-4341:
---

I think I have this resolved. It's all working locally now, including running 
bootstrap_hdfs.sh and running Accumulo with the jars out of HDFS. I'm going to 
put up a small patch against 1.8.0. I didn't have time to test against 1.7. If 
someone gets to it before I do, feel free to apply the patch and close my PR.

> ServiceLoader deadlock with classes loaded from HDFS
> 
>
> Key: ACCUMULO-4341
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4341
> Project: Accumulo
>  Issue Type: Bug
>  Components: client
>Affects Versions: 1.8.0
>Reporter: Dave Marion
>Priority: Blocker
> Fix For: 1.7.2, 1.8.0
>
>
> With Accumulo set up to use general.vfs.classpaths to load classes from HDFS, 
> running `accumulo help` will hang. 
> A jstack of the process shows the IPC Client thread at:
> {noformat}
>java.lang.Thread.State: BLOCKED (on object monitor)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2051)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:91)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>   at 
> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1086)
>   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:966)
> {noformat}
> and the main thread at:
> {noformat}
>java.lang.Thread.State: WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:502)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1454)
>   - locked <0xf09a2898> (a org.apache.hadoop.ipc.Client$Call)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
>   at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>   at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1982)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1128)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1124)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1124)
>   at 
> org.apache.commons.vfs2.provider.hdfs.HdfsFileObject.doAttach(HdfsFileObject.java:85)
>   at 
> org.apache.commons.vfs2.provider.AbstractFileObject.attach(AbstractFileObject.java:173)
>   - locked <0xf57fd008> (a 
> org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem)
>   at 
> org.apache.commons.vfs2.provider.AbstractFileObject.getContent(AbstractFileObject.java:1236)
>   - locked <0xf57fd008> (a 
> org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem)
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.getPermissions(VFSClassLoader.java:300)
>   at 
> java.security.SecureClassLoader.getProtectionDomain(SecureClassLoader.java:206)
>   - locked <0xf5ad9138> (a java.util.HashMap)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.defineClass(VFSClassLoader.java:226)
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.findClass(VFSClassLoader.java:180)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   - locked <0xf5af3b88> (a 
> org.apache.commons.vfs2.impl.VFSClassLoader)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>   - locked <0xf6f5c2f8> (a 
> org.apache.commons.vfs2.impl.VFSClassLoader)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
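
Reading the jstack above together with the IPC thread trace in the 
description: the main thread acquires the VFSClassLoader monitor in 
ClassLoader.loadClass(), then blocks in Client.call() waiting for an HDFS RPC 
response, while the IPC reader thread is blocked in Class.forName() waiting 
for that same VFSClassLoader monitor, so neither thread can proceed. A sketch 
of the configuration that triggers the hang, with a hypothetical HDFS path 
(the real value depends on where bootstrap_hdfs.sh put the jars):

{noformat}
<!-- accumulo-site.xml: load jars through the VFS class loader from HDFS -->
<property>
  <name>general.vfs.classpaths</name>
  <!-- hypothetical location; adjust namenode address and jar directory -->
  <value>hdfs://namenode:8020/accumulo/classpath/.*.jar</value>
</property>
{noformat}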

[jira] [Commented] (ACCUMULO-4341) ServiceLoader deadlock with classes loaded from HDFS

2016-06-13 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327966#comment-15327966
 ] 

Dave Marion commented on ACCUMULO-4341:
---

I believe the VFS code, without the ServiceLoader, will work. It works in 1.6.
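
A rough sketch of dispatch without the ServiceLoader, i.e. resolving the 
argument directly as a class name (the fallback the issue description 
mentions); the class and method names here are hypothetical:

{noformat}
// Hypothetical launcher that skips ServiceLoader entirely: one
// Class.forName call, no META-INF/services scan and no provider
// iteration inside the class loader's monitor.
public class DirectLauncher {
  public static void main(String[] args) throws Exception {
    ClassLoader loader = Thread.currentThread().getContextClassLoader();
    // Load the named class through the context (VFS) loader and hand
    // the remaining arguments to its main method.
    Class<?> target = Class.forName(args[0], true, loader);
    String[] rest = java.util.Arrays.copyOfRange(args, 1, args.length);
    target.getMethod("main", String[].class).invoke(null, (Object) rest);
  }
}
{noformat}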

> ServiceLoader deadlock with classes loaded from HDFS
> 
>
> Key: ACCUMULO-4341
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4341
> Project: Accumulo
>  Issue Type: Bug
>  Components: client
>Affects Versions: 1.8.0
>Reporter: Dave Marion
>Priority: Blocker
> Fix For: 1.7.2, 1.8.0
>
>
> With Accumulo set up to use general.vfs.classpaths to load classes from HDFS, 
> running `accumulo help` will hang. 
> A jstack of the process shows the IPC Client thread at:
> {noformat}
>java.lang.Thread.State: BLOCKED (on object monitor)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2051)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:91)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>   at 
> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1086)
>   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:966)
> {noformat}
> and the main thread at:
> {noformat}
>java.lang.Thread.State: WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:502)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1454)
>   - locked <0xf09a2898> (a org.apache.hadoop.ipc.Client$Call)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
>   at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>   at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1982)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1128)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1124)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1124)
>   at 
> org.apache.commons.vfs2.provider.hdfs.HdfsFileObject.doAttach(HdfsFileObject.java:85)
>   at 
> org.apache.commons.vfs2.provider.AbstractFileObject.attach(AbstractFileObject.java:173)
>   - locked <0xf57fd008> (a 
> org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem)
>   at 
> org.apache.commons.vfs2.provider.AbstractFileObject.getContent(AbstractFileObject.java:1236)
>   - locked <0xf57fd008> (a 
> org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem)
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.getPermissions(VFSClassLoader.java:300)
>   at 
> java.security.SecureClassLoader.getProtectionDomain(SecureClassLoader.java:206)
>   - locked <0xf5ad9138> (a java.util.HashMap)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.defineClass(VFSClassLoader.java:226)
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.findClass(VFSClassLoader.java:180)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   - locked <0xf5af3b88> (a 
> org.apache.commons.vfs2.impl.VFSClassLoader)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>   - locked <0xf6f5c2f8> (a 
> org.apache.commons.vfs2.impl.VFSClassLoader)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:370)
>   at 

[jira] [Commented] (ACCUMULO-3923) bootstrap_hdfs.sh does not copy correct jars to hdfs

2016-06-13 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-3923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327965#comment-15327965
 ] 

Dave Marion commented on ACCUMULO-3923:
---

bq. Any reason you didn't commit your change to 1.7 as well, Dave Marion? (1.6 
too even?)

Oversight on my part. It can be cherry-picked back to any release that uses 
slf4j.

>  bootstrap_hdfs.sh does not copy correct jars to hdfs
> -
>
> Key: ACCUMULO-3923
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3923
> Project: Accumulo
>  Issue Type: Bug
>Reporter: Josh Elser
>Assignee: Dave Marion
>Priority: Critical
> Fix For: 1.8.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> I'm trying to make the VFS classloading stuff work, and it doesn't seem like 
> ServiceLoader is finding any of the KeywordExecutable implementations.
> Best I can tell after looking into this, VFSClassLoader (created by 
> AccumuloVFSClassLoader) has all of the jars listed as resources, but when 
> ServiceLoader tries to find the META-INF/services definitions, it returns 
> nothing, and thus we think the keyword must be a class name. Seems like a 
> commons-vfs bug.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ACCUMULO-4341) ServiceLoader deadlock with classes loaded from HDFS

2016-06-13 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15327854#comment-15327854
 ] 

Dave Marion commented on ACCUMULO-4341:
---

I'm wondering if this is really due to a change in the ClassLoader object 
between Java 6 and 7. If you look at the javadoc for the two-arg form of 
ClassLoader.loadClass(), Java 7 introduced a class loading lock and 
parallel-capable class loaders.
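
For reference, a minimal sketch of the Java 7 mechanism that javadoc 
describes: a loader that registers as parallel capable gets a per-class-name 
lock from getClassLoadingLock(), while one that does not (as appears to be 
the case for VFSClassLoader, given the "locked ... (a 
org.apache.commons.vfs2.impl.VFSClassLoader)" frames in the jstack) 
synchronizes on the loader instance itself:

{noformat}
// Sketch of a loader opting in to the Java 7+ parallel-capable contract.
public class ParallelLoader extends ClassLoader {
  static {
    // Without this call, ClassLoader.loadClass(String, boolean)
    // synchronizes on the loader object itself for every class.
    registerAsParallelCapable();
  }

  @Override
  protected Class<?> loadClass(String name, boolean resolve)
      throws ClassNotFoundException {
    // For a parallel-capable loader, getClassLoadingLock(name) returns a
    // dedicated lock object per class name rather than 'this'.
    synchronized (getClassLoadingLock(name)) {
      return super.loadClass(name, resolve);
    }
  }
}
{noformat}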

> ServiceLoader deadlock with classes loaded from HDFS
> 
>
> Key: ACCUMULO-4341
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4341
> Project: Accumulo
>  Issue Type: Bug
>  Components: client
>Affects Versions: 1.8.0
>Reporter: Dave Marion
>Priority: Blocker
> Fix For: 1.7.2, 1.8.0
>
>
> With Accumulo set up to use general.vfs.classpaths to load classes from HDFS, 
> running `accumulo help` will hang. 
> A jstack of the process shows the IPC Client thread at:
> {noformat}
>java.lang.Thread.State: BLOCKED (on object monitor)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2051)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:91)
>   at 
> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
>   at 
> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
>   at 
> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1086)
>   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:966)
> {noformat}
> and the main thread at:
> {noformat}
>java.lang.Thread.State: WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:502)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1454)
>   - locked <0xf09a2898> (a org.apache.hadoop.ipc.Client$Call)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>   at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
>   at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>   at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1982)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1128)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1124)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1124)
>   at 
> org.apache.commons.vfs2.provider.hdfs.HdfsFileObject.doAttach(HdfsFileObject.java:85)
>   at 
> org.apache.commons.vfs2.provider.AbstractFileObject.attach(AbstractFileObject.java:173)
>   - locked <0xf57fd008> (a 
> org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem)
>   at 
> org.apache.commons.vfs2.provider.AbstractFileObject.getContent(AbstractFileObject.java:1236)
>   - locked <0xf57fd008> (a 
> org.apache.commons.vfs2.provider.hdfs.HdfsFileSystem)
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.getPermissions(VFSClassLoader.java:300)
>   at 
> java.security.SecureClassLoader.getProtectionDomain(SecureClassLoader.java:206)
>   - locked <0xf5ad9138> (a java.util.HashMap)
>   at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.defineClass(VFSClassLoader.java:226)
>   at 
> org.apache.commons.vfs2.impl.VFSClassLoader.findClass(VFSClassLoader.java:180)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   - locked <0xf5af3b88> (a 
> org.apache.commons.vfs2.impl.VFSClassLoader)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
>   - locked <0xf6f5c2f8> (a 
> org.apache.commons.vfs2.impl.VFSClassLoader)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.forName0(Native Method)
>   at 
