Adding a REVERSIBLE characteristic to spliterators is easy enough, and as you say, many useful sources can easily provide an efficient reverse operation.  Filtering and mapping can preserve reversibility. The real question is what to do if someone calls reverse() on a non-reversible stream.  (Finite streams that are not easily reversible can still be reversed by draining the stream into a buffer, but that's likely inefficient; infinite streams have no chance.)  Should reverse() just throw when it encounters a source not designed for reversibility?  I don't particularly like the idea of throwing based on a dynamic source property, but trying to thread ReversibleStream through the static types is probably worse.

In the "drain" it case, the user can always simulate this manually: call toArray and get a reversed stream from that (since array-based streams are trivially reversible.)

I'm not particularly convinced of the value of folding from both directions.  The main value of foldr over foldl in Haskell is that because of laziness, foldr on infinite lists can work with short-circuiting combiners, whereas foldl cannot.  In strict languages, this benefit is not present, making the marginal value of foldr over foldl much smaller.

But, Stuart's proposal gives us much of this with less fuss.  IF a collection is reversible, you can just say

    c.reversed().stream()

and off you go, plus a similar method for arrays (Arrays.reversedStream).  Given that, what is the value of pushing this further into stream?

On 4/17/2021 7:49 AM, Tagir Valeev wrote:
Great proposal, thanks!

It has most functionality from my previous proposal, except the
ability to iterate LinkedHashSet starting from a specific element.
Well, probably we can skip this for now.

Some people really want to reverse Streams as well. E. g., the
corresponding StackOverflow question [1] has 171 upvotes and 200k+
views. Also, if the stream were reversible, it would be possible to
implement efficient foldRight/reduceRight operation. See the
discussion at [2]. It would be nice to explore the possibility of
extending Stream API to take into account the source reversibility.
Some existing ordered sources that have no underlying collection (e.g.
IntStream.range, or Arrays.stream) can be easily reversed as well. Map
and filter operations preserve reversibility and flatMap is reversible
if the supplied stream is reversible as well.

With best regards,
Tagir Valeev.

[1] https://stackoverflow.com/q/24010109/4856258
[2] https://bugs.openjdk.java.net/browse/JDK-8133680

On Sat, Apr 17, 2021 at 12:41 AM Stuart Marks <stuart.ma...@oracle.com> wrote:
This is a proposal to add a ReversibleCollection interface to the Collections
Framework. I'm looking for comments on overall design before I work on detailed
specifications and tests. Please send such comments as replies on this email 
thread.

Here's a link to a draft PR that contains the code diffs. It's prototype 
quality,
but it should be good enough to build and try out:

      https://github.com/openjdk/jdk/pull/3533

And here's a link to a class diagram showing the proposed additions:


https://cr.openjdk.java.net/~smarks/ReversibleCollection/ReversibleCollectionDiagram.pdf

Thanks,

s'marks


# Ordering and Reversibility


A long-standing concept that's been missing from collections is that of the
positioning, sequencing, or arrangement of elements as a structural property of 
a
collection. (This is sometimes called the "iteration order" of a collection.) 
For
example, a HashSet is not ordered, but a List is. This concept is mostly not
manifested in the collections API.

Iterating a collection produces elements one after another in *some* sequence. 
The
concept of "ordered" determines whether this sequence is defined or whether 
it's a
coincidence of implementation. What does "having an order" mean? It implies that
there is a first element and that each element has a successor. Since 
collections
have a finite number of elements, it further implies that there is a last 
element
that has no successor. However, it is difficult to discern whether a collection 
has
a defined order. HashSet generally iterates its elements in the same undefined
order, and you can't actually tell that it's not a defined order.

Streams do have a notion of ordering ("encounter order") and this is preserved,
where appropriate, through the stream pipeline. It's possible to detect this by
testing whether its spliterator has the ORDERED characteristic. Any collection 
with
a defined order will have a spliterator with this characteristic. However, this 
is
quite a roundabout way to determine whether a collection has a defined order.
Furthermore, knowing this doesn't enable any additional operations. It only 
provides
constraints on the stream's implementations (keeping the elements in order) and
provides stronger semantics for certain operations. For example, findFirst() on 
an
unordered stream is the same as findAny(), but actually finds the first element 
if
the stream is ordered.

The concept of ordering is thus present in the system but is surfaced only in a
fairly indirect way. We can strengthen abstraction of ordering by making a few
observations and considering their implications.

Given that there is a first element and a last element, the sequence of 
elements has
two ends. It's reasonable to consider operations (add, get, remove) on either 
end.
Indeed, the Deque interface has a full complement of operations at each end. 
This is
an oft-requested feature on various other collections.

Each element except for the last has a successor, implying that each element 
except
for the first has a predecessor. Thus it's reasonable to consider iterating the
elements from first to last or from last to first (that is, in forward or 
reverse
order). Indeed, the concept of iterating in reverse order appears already in 
bits
and pieces in particular places around the collections:

   - List has indexOf() and lastIndexOf()
   - Deque has removeFirstOccurrence() and removeLastOccurrence()
   - List has a ListIterator with hasPrevious/previous methods
   - Deque and NavigableSet have descendingIterator methods

Given an ordered collection, though, there's no general way to iterate it in 
reverse
order. Reversed iteration isn't the most common operation, but there are some
frequent use cases, such as operating on elements in most-recently-added order.
Questions and bug reports about this have come up repeatedly over the years.

Unfortunately, iterating in reverse order is much harder than iterating in 
forward
order. There are a variety of ways to iterate in forward order. For example, 
given a
List, one can do any of the following:

      for (var e : list) { ... }
      list.forEach(...)
      list.stream()
      list.toArray()

However, to iterate a list in reverse order, one must use an explicit loop over
ListIterator:

      for (var it = list.listIterator(list.size()); it.hasPrevious(); ) {
          var e = it.previous();
          ...
      }

Streaming the elements of a List in reverse order is even worse. One approach 
would
be to implement a reverse-ordered Iterator that wraps a ListIterator and 
delegates
hasNext/next calls to the ListIterator's hasPrevious/previous methods. Then, 
this
Iterator can be turned into a Spliterator, which is then turned into a Stream. 
(The
code to do this is left as an exercise for the reader.) Obtaining the elements 
in
reverse via other means -- forEach() or toArray() -- is similarly difficult.

Obtaining the elements in reverse order can be accomplished by adopting a 
concept
from NavigableSet's descendingSet() method, which provides a reverse-ordered 
view
collection. This view is also a NavigableSet, which means that any operation 
that
can be performed on the original set can also be applied to the reverse-ordered
view. If it were possible to obtain a similar reverse-ordered view on any kind 
of
ordered collection, this would enable the elements to be processed in reverse 
order
in any fashion, not just special-purposes APIs such as ListIterator or
descendingIterator().


# LinkedHashSet


The main feature of LinkedHashSet is that it does have a defined ordering, as
opposed to HashSet, which does not. LinkedHashSet clearly has a first and a last
element. However, LinkedHashSet is deficient in a number of ways:

   - access to and removal of the first element is reasonable (using an 
iterator) but
it is not possible to add at the front

   - it is possible to add an element at the end, but access to and removal of 
the
last element are expensive

   - it's not possible to iterate in reverse without copying the entire 
collection

Most frustratingly, LinkedHashSet is implemented using a doubly-linked list, so
there is no fundamental implementation reason that prevents these operations 
from
being supported. The main reason these operations aren't supported is probably 
that
there hasn't been a good place to push such APIs.


# Proposal


Introduce a new ReversibleCollection<E> interface. This provides a new method to
obtain a reverse-ordered view. It also contains several methods (copied from 
Deque)
that allow operating on elements at both ends.


      interface ReversibleCollection<E> extends Collection<E> {
          ReversibleCollection<E> reversed();
          default void addFirst(E e)
          default void addLast(E e)
          default E getFirst()
          default E getLast()
          default E removeFirst()
          default E removeLast()
      }


The List, Deque, and SortedSet interfaces, and the LinkedHashSet class are
retrofitted to implement ReversibleCollection. They provide covariant overriding
implementations of the reversed() method. For example, List.reversed() returns a
List. Default implementations for all of these methods are provided in the
appropriate interfaces.

Covariant overrides are also provided in several other classes. This presents a
difficulty for LinkedHashSet, as there's no suitable return type for 
reversed(). To
remedy this, we add another interface


      interface ReversibleSet<E> extends ReversibleCollection<E>, Set<E> { }


This adds no new methods but is essentially an intersection of 
ReversibleCollection
and Set. As such, it's suitable as the return type of LinkedHashSet.reversed() 
and
the set views of LinkedHashMap.

SortedSet::addFirst and addLast throw UnsupportedOperationException. This is 
because
SortedSet's ordering is determined by the comparison method and cannot be 
controlled
explicitly.

LinkedHashSet::addFirst and addLast add the element at the appropriate position 
or
reposition the element if it is already present in the set.

New methods are added to LinkedHashMap

      V putFirst(K, V)
      V putLast(K, V)

which put the mapping at the designated position or reposition the mapping if is
already present in the map.


# Design & Implementation Issues


Covariant overrides for reversed() are provided where possible. However, there 
is a
conflict between List and Deque, as there are examples in the JDK (such as
LinkedList) that implement both List and Deque. Since List is much more popular 
than
Deque, I've decided against adding a covariant override to Deque. Instead, a 
method
Deque::reversedDeque is added that returns a Deque, and Deque::reversed returns
ReversibleCollection.

There is no ReversibleMap interface. Most iteration over maps is over the 
entrySet,
keySet, or values views. NavigableMap already has descendingMap(), and 
LinkedHashMap
provides ReversibleX views, which cover most of the cases already.

Introducing new methods into an interface hierarchy always raises the 
possibility of
name clashes. I've done some searching but I haven't found any major issues, 
but we
should be on the lookout for this. I'll continue to do more searching for such
conflicts.

Introducing covariant overrides on existing methods (notably LinkedHashMap 
entrySet,
keySet, and values) could present incompatibility issues with subclasses that
override these methods. I've done some searching and again I haven't found major
problems, but this is nonetheless a point of concern. I'll do more searching 
here, too.

The default implementations for List::reversed, Deque::reversed, and
SortedSet::reversed return reverse-ordered views that are implemented using only
public methods of the original interface. This demonstrates the feasibility of
retrofitting reverse ordering onto any instance of these interfaces. Similarly, 
the
various add/get/remove methods' default implementations are all implementable in
terms of an iterator or reverse-ordered iterator. That said, some concurrent
collection implementations will need to override these default implementations 
in
order to preserve their safety invariants. It would probably also be wise to add
optimized implementations to the more commonly-used collection implementations.

Given that a ReversibleCollection has a defined ordering, any spliterator 
obtained
from such a collection should have the Spliterator.ORDERED characteristic.

This proposal is related to a previous discussion on a similar topic, where 
Tagir
Valeev had proposed OrderedSet and OrderedMap:

      
http://mail.openjdk.java.net/pipermail/core-libs-dev/2020-April/066028.html

I think there were some good ideas in there and in the ensuing discussion, but 
it
didn't go anywhere. Several of the concepts in this proposal are an evolution of
some ideas discussed in that thread.

One of the ideas from that thread was to use Deque as a common interface for 
many
collections to support reversibility and operations at the ends. We've abandoned
that idea in this proposal. The reasons are that it's cluttered with a bunch of
methods that are inherited from Queue. Also, some methods are null-returning 
instead
of throwing, which introduces confusion for collections that can contain nulls.

# END

Reply via email to