Accumulo-Master - Build # 2073 - Fixed

2017-05-02 Thread Apache Jenkins Server
The Apache Jenkins build system has built Accumulo-Master (build #2073)

Status: Fixed

Check console output at https://builds.apache.org/job/Accumulo-Master/2073/ to 
view the results.

Accumulo-1.8 - Build # 162 - Still Failing

2017-05-02 Thread Apache Jenkins Server
The Apache Jenkins build system has built Accumulo-1.8 (build #162)

Status: Still Failing

Check console output at https://builds.apache.org/job/Accumulo-1.8/162/ to view 
the results.

Accumulo-1.7 - Build # 338 - Failure

2017-05-02 Thread Apache Jenkins Server
The Apache Jenkins build system has built Accumulo-1.7 (build #338)

Status: Failure

Check console output at https://builds.apache.org/job/Accumulo-1.7/338/ to view 
the results.

[jira] [Commented] (ACCUMULO-3521) Unused Iterator-related classes

2017-05-02 Thread Michael Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993653#comment-15993653
 ] 

Michael Miller commented on ACCUMULO-3521:
--

[~dlmarion] I didn't remove anything.  I added tests and did some cleanup to 
FirstEntryInRowIterator and TypedValueCombiner. See 
https://github.com/apache/accumulo/pull/237/files

I also added a test for StatsCombiner in 
https://github.com/apache/accumulo-examples

> Unused Iterator-related classes
> ---
>
> Key: ACCUMULO-3521
> URL: https://issues.apache.org/jira/browse/ACCUMULO-3521
> Project: Accumulo
> Issue Type: Sub-task
> Reporter: Christopher Tubbs
> Assignee: Michael Miller
> Fix For: 2.0.0
>
> Time Spent: 4.5h
> Remaining Estimate: 0h
>
> The following iterators and iterator-related utilities have no references at 
> all in our code; minimally, they should have unit tests (this list does not 
> include deprecated iterators):
> # OrIterator
> # FirstEntryInRowIterator (especially setNumScansBeforeSeek method is never 
> called)
> # TypedValueCombiner (especially setLossyness method is never called)
> # IteratorUtil.getMaxPriority and .findIterator methods
> # StatsCombiner (available in the examples) does not use the setRadix method
> Note: these iterators may not be considered "public API", but it might still 
> be unsafe to remove the unused portions if somebody has used them.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (ACCUMULO-3521) Unused Iterator-related classes

2017-05-02 Thread Dave Marion (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993636#comment-15993636
 ] 

Dave Marion commented on ACCUMULO-3521:
---

For the items removed in 2.0, are they marked deprecated in 1.8.x?



[jira] [Commented] (ACCUMULO-4629) Seeking in timestamp range is slow

2017-05-02 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993506#comment-15993506
 ] 

Keith Turner commented on ACCUMULO-4629:


Another possible option is to allow turning on statefulness in the deleting 
iterator.  By default it would not be on, but an iterator could turn it on.

> Seeking in timestamp range is slow
> --
>
> Key: ACCUMULO-4629
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4629
> Project: Accumulo
> Issue Type: Bug
> Reporter: Keith Turner
> Fix For: 2.0.0
>
>
> Fluo's internal schema uses the first 4 bits of the timestamp to store 
> different types of information per column.  These first 4 bits divide the 
> timestamp space up into 16 ranges.  Fluo has server side iterators that 
> consider information in one of these 16 ranges and then seek forward to 
> another of the 16 ranges.
> Unfortunately, Accumulo's built-in iterator that processes delete markers 
> makes seeking within the timestamps of a column an O(N^2) operation.  This is 
> because of the way deletes work in Accumulo.  A delete marker for timestamp X 
> in Accumulo deletes anything with a timestamp <= X.
> When seeking to timestamp Y, the Accumulo iterator that handles deletes will 
> scan from MAX_LONG to Y looking for any deletes that may keep you from seeing 
> data at timestamp Y.  The following example shows what the delete iterator 
> will do when a user iterator does some seeks.
>  * User iterator seeks to stamp 1,000,000.  This causes the delete iter to 
> scan from MAX_LONG to 1,000,000 looking for delete markers.
>  * User iterator seeks to stamp 900,000.  This causes the delete iter to scan 
> from MAX_LONG to 900,000 looking for delete markers.
>  * User iterator seeks to stamp 500,000.  This causes the delete iter to scan 
> from MAX_LONG to 500,000 looking for delete markers.
> So that Fluo can seek efficiently, it has done some [serious 
> shenanigans|https://github.com/apache/incubator-fluo/blob/rel/fluo-1.0.0-incubating/modules/accumulo/src/main/java/org/apache/fluo/accumulo/iterators/TimestampSkippingIterator.java#L164]
>  using reflection to remove the DeleteIterator.  The great work being done 
> on ACCUMULO-3079 will likely break this crazy reflection code.  So I would 
> like to make a change to upstream Accumulo that allows efficient seeking in 
> the timestamp range.  I have thought of the following two possible solutions.
>  * Make the DeleteIterator stateful so that it remembers what ranges it has 
> scanned for deletes.  I don't really like this solution because it will add 
> expense to every seek in Accumulo for an edge case.
>  * Make it possible to create tables with an exact delete behavior, meaning a 
> delete for timestamp X will only delete an existing row column with that 
> exact timestamp.  This option could only be chosen at table creation time and 
> could not be changed.  With this delete behavior, the delete iterator does not 
> need to scan for every seek.
> Are there other possible solutions?
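
The seek cost described in the issue can be illustrated with a small model. This is only a sketch under stated assumptions, not Accumulo's DeleteIterator: a column is reduced to an array of timestamps sorted newest first, and the class and method names (DeleteSeekCost, entriesScanned, totalScanned) are hypothetical. It counts how many entries are examined if every seek to timestamp Y restarts from the top of the column.

```java
class DeleteSeekCost {
    // Counts the entries examined by a seek to `target` that always starts
    // at the newest entry and walks down until it passes the target.
    // `timestampsDescending` models one column, newest timestamp first.
    static long entriesScanned(long[] timestampsDescending, long target) {
        long scanned = 0;
        for (long ts : timestampsDescending) {
            if (ts < target) {
                break; // past the seek target, nothing further is examined
            }
            scanned++;
        }
        return scanned;
    }

    // Total work for a series of seeks, each restarting from the top.
    static long totalScanned(long[] timestampsDescending, long[] seekTargets) {
        long total = 0;
        for (long target : seekTargets) {
            total += entriesScanned(timestampsDescending, target);
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1000;
        long[] column = new long[n];
        long[] seeks = new long[n];
        for (int i = 0; i < n; i++) {
            column[i] = n - i; // timestamps n, n-1, ..., 1
            seeks[i] = n - i;  // seek to each timestamp in turn, newest first
        }
        // 1 + 2 + ... + n entries are examined in total.
        System.out.println(totalScanned(column, seeks)); // prints 500500
    }
}
```

In this model, N seeks over a column of N versions examine N(N+1)/2 entries, which is the quadratic growth the issue describes.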





[jira] [Commented] (ACCUMULO-4629) Seeking in timestamp range is slow

2017-05-02 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993503#comment-15993503
 ] 

Keith Turner commented on ACCUMULO-4629:


I discussed this with [~dlmarion] offline.  Another possible option came out of 
this discussion: a table option could change how deletes are sorted, so that 
keys sort on delete and then timestamp.  This would make all deletes for a 
column sort to the beginning of the column.  With the deletes at the beginning 
of the column, the delete iterator could handle seeking a bit more efficiently. 
It still would not be as efficient as ignoring deletes, because each seek of 
the deleting iter would require it to seek backwards to the beginning of the 
column.  It would be an improvement in that the entire column no longer needs 
to be scanned.
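
The sort order proposed in this comment might look roughly like the sketch below. Everything in it is an assumption made for illustration: Entry is a hypothetical stand-in for one version of a column (it is not Accumulo's Key class), and DELETES_FIRST, sortedColumn, and visible are invented names. The point is only that, once delete markers sort ahead of data, the newest delete marker is the first entry of the column, so a visibility check reads the front of the column instead of scanning all of it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class DeletesFirstSketch {
    // Hypothetical stand-in for one version of a column; not Accumulo's Key.
    static final class Entry {
        final long timestamp;
        final boolean delete;

        Entry(long timestamp, boolean delete) {
            this.timestamp = timestamp;
            this.delete = delete;
        }
    }

    // Delete markers sort first; within each group, newest timestamp first.
    static final Comparator<Entry> DELETES_FIRST =
        Comparator.comparing((Entry e) -> !e.delete)
            .thenComparing(Comparator.comparingLong((Entry e) -> e.timestamp).reversed());

    // Returns a copy of the column in the proposed order.
    static List<Entry> sortedColumn(List<Entry> entries) {
        List<Entry> copy = new ArrayList<>(entries);
        copy.sort(DELETES_FIRST);
        return copy;
    }

    // A delete marker at X hides every entry with timestamp <= X, so only the
    // newest marker matters, and it is the first entry of the sorted column.
    static boolean visible(List<Entry> sortedColumn, long timestamp) {
        if (sortedColumn.isEmpty() || !sortedColumn.get(0).delete) {
            return true; // no delete marker in this column
        }
        return timestamp > sortedColumn.get(0).timestamp;
    }

    public static void main(String[] args) {
        List<Entry> raw = new ArrayList<>();
        raw.add(new Entry(5, false));
        raw.add(new Entry(9, false));
        raw.add(new Entry(7, true));
        raw.add(new Entry(3, true));
        List<Entry> column = sortedColumn(raw);
        System.out.println(visible(column, 9)); // prints true
        System.out.println(visible(column, 5)); // prints false
    }
}
```

As the comment notes, each seek would still need to go back to the beginning of the column to read that first entry, so this is cheaper than a full scan but not free.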



[jira] [Assigned] (ACCUMULO-4365) ShellServerIT#trace() failing intermittently due to missing "sendMutations" block

2017-05-02 Thread Michael Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Miller reassigned ACCUMULO-4365:


Assignee: Michael Miller

> ShellServerIT#trace() failing intermittently due to missing "sendMutations" 
> block
> -
>
> Key: ACCUMULO-4365
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4365
> Project: Accumulo
> Issue Type: Bug
> Components: test
> Reporter: Josh Elser
> Assignee: Michael Miller
> Fix For: 1.8.2, 2.0.0
>
> Time Spent: 3h 50m
> Remaining Estimate: 0h
>
> Noticed this on master, but not sure if it also affects other branches.
> {noformat}
> trace(org.apache.accumulo.test.ShellServerIT)  Time elapsed: 5.166 sec  <<< 
> FAILURE!
> java.lang.AssertionError
>   at org.apache.accumulo.test.ShellServerIT.trace(ShellServerIT.java:1630)
> {noformat}
> This is a trace that was observed when the test case failed.
> {noformat}
> Trace started at 2016/07/10 22:43:38.277
> Time  Start  Service@Location   Name
>  3446+0  shell@localhost shell:root
> 1+160  master@0.0.0.0 beginFateOperation
> 4+167  master@0.0.0.0 executeFateOperation
> 3+173master@0.0.0.0 CreateTable
> 2+176master@0.0.0.0 CreateTable
>16+181  master@0.0.0.0 SetupPermissions
> 4+200master@0.0.0.0 PopulateZookeeper
>19+204master@0.0.0.0 PopulateZookeeper
> 1+694  master@0.0.0.0 ChooseDir
> 1+709master@0.0.0.0 CreateDir
> 2+712  master@0.0.0.0 PopulateMetadata
> 1+713tserver@0.0.0.0 update
> 1+713  tserver@0.0.0.0 prep
> 5+716master@0.0.0.0 FinishCreateTable
>   563+172  master@0.0.0.0 waitForFateOperation
> 2+736  master@0.0.0.0 finishFateOperation
>  1513+745  shell@localhost close
>13+746shell@localhost BinMutations 1
> 5+746  shell@localhost binMutations
> 2+748tserver@0.0.0.0 startScan
> 1+748  tserver@0.0.0.0 metadata tablets read ahead 5
> 3+2259 tserver@0.0.0.0 getTableConfiguration
> 3+2263 tserver@0.0.0.0 getTableConfiguration
> 3+2267 tserver@0.0.0.0 getTableConfiguration
> 3+2270 tserver@0.0.0.0 getTableConfiguration
> 3+2281 shell@localhost scan
> 2+2282   shell@localhost scan:location
> 2+2282 tserver@0.0.0.0 startScan
> 2+2282   tserver@0.0.0.0 tablet read ahead 6
> 7+2285 master@0.0.0.0 beginFateOperation
> 2+2293 master@0.0.0.0 executeFateOperation
> 3+2297   master@0.0.0.0 DeleteTable
> 1+2300   master@0.0.0.0 DeleteTable
> 4+2413 master@0.0.0.0 CleanUp
> 2+2415   master@0.0.0.0 scan
> 1+2415 master@0.0.0.0 scan:location
> 1+2415   tserver@0.0.0.0 startScan
> 1+2415 tserver@0.0.0.0 metadata tablets read ahead 6
>20+2417 master@0.0.0.0 CleanUp
> 2+2417   master@0.0.0.0 batch scanner 555- 1
> 1+2417 master@0.0.0.0 client:startMultiScan
> 1+2418 tserver@0.0.0.0 startMultiScan
> 1+2418   tserver@0.0.0.0 metadata tablets read ahead 7
> 1+2420   master@0.0.0.0 scan
> 1+2420 master@0.0.0.0 scan:location
> 1+2420   tserver@0.0.0.0 startScan
> 1+2420 tserver@0.0.0.0 metadata tablets read ahead 1
> 2+2421   master@0.0.0.0 close
> 1+2421 master@0.0.0.0 BinMutations 1
> 1+2421   master@0.0.0.0 binMutations
> 1+2423   master@0.0.0.0 scan
> 1+2423 master@0.0.0.0 scan:location
> 1+2423   tserver@0.0.0.0 startScan
> 1+2423 tserver@0.0.0.0 metadata tablets read ahead 8
>   145+2296 master@0.0.0.0 waitForFateOperation
> 1+2441 master@0.0.0.0 finishFateOperation
> {noformat}
> In another run where the test did not fail:
> {noformat}
> Trace started at 2016/07/10 22:48:06.432
> Time  Start  Service@Location   Name
>  3066+0  shell@localhost shell:root
> 5+210  master@0.0.0.0 beginFateOperation
> 4+222  master@0.0.0.0 executeFateOperation
> 2+228master@0.0.0.0 CreateTable
> 2+230master@0.0.0.0 CreateTable
>15+235  master@0.0.0.0 SetupPermissions
> 1+252master@0.0.0.0 PopulateZookeeper
>10+253master@0.0.0.0 PopulateZookeeper
> 2+266  master@0.0.0.0 ChooseDir
>70+227  master@0.0.0.0 waitForFateOperation
> 2+298  master@0.0.0.0 finishFateOperation
>  1511+306  shell@localhost close
> 9+306  

[jira] [Resolved] (ACCUMULO-3521) Unused Iterator-related classes

2017-05-02 Thread Michael Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Miller resolved ACCUMULO-3521.
--
Resolution: Fixed



[jira] [Commented] (ACCUMULO-3521) Unused Iterator-related classes

2017-05-02 Thread Christopher Tubbs (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15992986#comment-15992986
 ] 

Christopher Tubbs commented on ACCUMULO-3521:
-

Sure. Any progress at all on this will have been useful. We can always re-run 
UCDetector in the future.



[jira] [Commented] (ACCUMULO-3521) Unused Iterator-related classes

2017-05-02 Thread Michael Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15992981#comment-15992981
 ] 

Michael Miller commented on ACCUMULO-3521:
--

[~ctubbsii] With the updates in PR #237, Josh's in 
https://github.com/apache/accumulo/pull/247, and the test I added in 
https://github.com/apache/accumulo-examples/commit/bec846ce3dec6b594e5e54a1349a99f5202dd91b
, can I mark this resolved?  I think the other methods in (4.) have been removed.
