> On Dec. 21, 2018, 4:21 p.m., András Piros wrote:
> > core/src/main/java/org/apache/oozie/command/PurgeXCommand.java
> > Lines 81 (patched)
> > <https://reviews.apache.org/r/69594/diff/2/?file=2115651#file2115651line81>
> >
> >     What's the semantics of `T` and `U`?

T is the outer interface. This class takes a T node and traverses the tree and 
returns List<T>. During the traverse it stores intermediate results using class 
U instead of T.


> On Dec. 21, 2018, 4:21 p.m., András Piros wrote:
> > core/src/main/java/org/apache/oozie/command/PurgeXCommand.java
> > Lines 105-107 (patched)
> > <https://reviews.apache.org/r/69594/diff/2/?file=2115651#file2115651line105>
> >
> >     So you're checking whether there was a duplicate... I think it's a lot 
> > of `equals()` call when speaking of a `Set#addAll()` in a `while` loop.
> >     
> >     Do you think we need optimizing on that? About how many times is 
> > `Set#addAll()` called?

The number of addAll calls depends on the structure of the tree, but addAll is 
basically just a series of add() calls. The total number of add() calls are the 
same as the number of descendant nodes (n). HashSet offers contant time basic 
operations (e.g. add), so this is still O(n) just like in the original version.

It is still slower of course, because the constant factor is bigger. And it 
consumes more memory because we have an ArrayList and a HashSet at the same 
time.

If we only use an ArrayList (to avoid to consume more memory) then the search 
for the duplicate will be slower and we cannot keep O(n).

I'm unable to find a way to search for duplicates without descreasing the speed 
and increasing the memory consumption. We should decide which one is more 
important: recognizing cycles or speed. A cycle in the oozie db could cause 
lots of problems (infinite loops in several places) but probably it's very-very 
rare and happens only if someone manually modifies the database.


> On Dec. 21, 2018, 4:21 p.m., András Piros wrote:
> > core/src/main/java/org/apache/oozie/command/PurgeXCommand.java
> > Lines 109-111 (patched)
> > <https://reviews.apache.org/r/69594/diff/2/?file=2115651#file2115651line109>
> >
> >     When not all the descendant nodes were selected, we don't select any on 
> > this level, right?

If not all the descendant nodes were selected then traver returns an empty list 
because we don't select any node. Not only on this level, we don't select any 
node at all. So we either delete all the descendant workflows during purge, or 
we keep the whole tree of the descendant nodes.


- Andras


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69594/#review211500
-----------------------------------------------------------


On Dec. 21, 2018, 3:17 p.m., Andras Salamon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69594/
> -----------------------------------------------------------
> 
> (Updated Dec. 21, 2018, 3:17 p.m.)
> 
> 
> Review request for oozie, András Piros and Kinga Marton.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> OOZIE-3400: Fix PurgeService sub-sub-workflow checking
> 
> 
> Diffs
> -----
> 
>   core/src/main/java/org/apache/oozie/ErrorCode.java 9cc153bb0 
>   core/src/main/java/org/apache/oozie/command/PurgeXCommand.java 42c3b28a6 
>   core/src/test/java/org/apache/oozie/command/TestPurgeXCommand.java 
> d11fcffbb 
>   core/src/test/java/org/apache/oozie/command/TestSelectorTreeTraverser.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69594/diff/2/
> 
> 
> Testing
> -------
> 
> Run TestPurgeXCommand unit tests locally.
> 
> 
> Thanks,
> 
> Andras Salamon
> 
>

Reply via email to