And testing what happens if the comparator throws an exception on a different thread.

In this case, we want all fork/join (FJ) tasks to stop ASAP. But the Java library code may be written in a way that handles more general cases.
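
A rough probe for that test could look like the sketch below. It is only an assumption of how one might check it, not Jena code: the names ParallelSortCancelProbe, SortCancelledException and parallelSortUnlessCancelled are made up for illustration. The flag is an AtomicBoolean because the comparator may be called from several common-pool threads at once.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.concurrent.atomic.AtomicBoolean;

final class ParallelSortCancelProbe {

    /** Hypothetical unchecked exception used to abandon the sort. */
    static class SortCancelledException extends RuntimeException {
        SortCancelledException() { super("sort cancelled"); }
    }

    /**
     * Sorts with Arrays.parallelSort and reports whether the sort completed.
     * Returns false if the comparator aborted the sort.
     */
    static boolean parallelSortUnlessCancelled(Integer[] data, AtomicBoolean cancelled) {
        Comparator<Integer> cmp = (a, b) -> {
            if (cancelled.get()) {
                throw new SortCancelledException();
            }
            return a.compareTo(b);
        };
        try {
            Arrays.parallelSort(data, cmp);
            return true;
        } catch (SortCancelledException e) {
            // Possibly thrown on a worker thread and surfaced here on the caller.
            // Other subtasks may still be finishing, so do not trust the array.
            return false;
        }
    }

    public static void main(String[] args) {
        Integer[] data = new Integer[200_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = data.length - i;   // reverse order, so there is work to do
        }
        AtomicBoolean cancelled = new AtomicBoolean(true);  // cancel before the first compare
        System.out.println("completed: " + parallelSortUnlessCancelled(data, cancelled));
    }
}
```

This only shows that the exception reaches the caller of parallelSort. How quickly the remaining FJ subtasks stop after one of them throws is the part that would still need measuring.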

    Andy

On 16/07/16 00:41, Stian Soiland-Reyes wrote:
I think you are right: sorting an array would probably not be doing
any interruptible/blocking operations, so the thread would keep sorting,
and basically you would then need a Comparator that checks the thread
interruption status and bails out. :-)

So I guess Chris's suggestion is a good approach. Testing possible
performance improvements on big arrays with parallelSort would be a separate
task, which can still use the same Comparator (but with a slightly different
exception pattern, and it would then need to avoid the non-thread-safe
internal counter).
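
For reference, a Comparator that checks the thread interruption status and bails out could be sketched roughly as below. This is an illustration under assumptions, not Jena code: InterruptibleSortSketch, SortInterruptedException, interruptible and sortInterruptibly are hypothetical names. The sort runs in a single worker thread via Arrays.sort(), so future.cancel(true) interrupts the same thread that calls compare().

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class InterruptibleSortSketch {

    /** Hypothetical unchecked exception used to abandon the sort. */
    static class SortInterruptedException extends RuntimeException {
        SortInterruptedException() { super("sort interrupted"); }
    }

    /** Wraps a comparator so it bails out once the calling thread is interrupted. */
    static <T> Comparator<T> interruptible(Comparator<? super T> delegate) {
        return (a, b) -> {
            // isInterrupted() does not clear the flag, so callers can still see it.
            if (Thread.currentThread().isInterrupted()) {
                throw new SortInterruptedException();
            }
            return delegate.compare(a, b);
        };
    }

    /** Returns true if the sort completed, false if it was abandoned. */
    static <T> boolean sortInterruptibly(T[] data, Comparator<? super T> cmp) {
        try {
            Arrays.sort(data, InterruptibleSortSketch.<T>interruptible(cmp));
            return true;
        } catch (SortInterruptedException e) {
            return false;   // array order is unspecified; discard it
        }
    }

    public static void main(String[] args) throws Exception {
        Integer[] data = new Random(42).ints(2_000_000).boxed().toArray(Integer[]::new);
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Callable<Boolean> job = () -> sortInterruptibly(data, Comparator.<Integer>naturalOrder());
        Future<Boolean> future = pool.submit(job);
        Thread.sleep(50);        // let the sort get going (timing is not guaranteed)
        future.cancel(true);     // interrupts the worker if it is still sorting
        pool.shutdown();
        System.out.println("worker stopped: " + pool.awaitTermination(10, TimeUnit.SECONDS));
    }
}
```

This pattern only helps when compare() runs on the interrupted thread. With parallelSort the comparisons may run on pool threads that were never interrupted, which is one reason a shared flag is the safer signal there.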

On 15 Jul 2016 11:54 p.m., "Andy Seaborne" <[email protected]> wrote:

On 15/07/16 15:10, Stian Soiland-Reyes wrote:

First of all the sorting could be sped up significantly on a
multi-core machine by using Java 8's:

http://docs.oracle.com/javase/8/docs/api/java/util/Arrays.html#parallelSort-T:A-

(The speed-up can in some cases be much larger than the multiple of
the number of cores)


..sadly the underlying ArraysParallelSortHelpers.FJObject.Sorter is
not publicly accessible, because then its ForkJoinTask could be used
to cancel an ongoing sort.

What about instead using a FutureTask which you cancel with
future.cancel(true),



http://stackoverflow.com/questions/11158454/future-task-of-executorservice-not-truly-cancelling

the task would just call Arrays.parallelSort().


and may not get cancelled :-(


The array might be in a weird state if you cancel/interrupt mid-sort
(e.g. duplicate values) - but on cancelling you can clear/forget the
array. And by keeping the array sorting in a separate thread you won't
be interrupting something within Jena's internals (e.g. a TDB index
update).



On 15 July 2016 at 12:08, Chris Dollin <[email protected]>
wrote:

Dear All

When a query with an ORDER BY is cancelled, the component
Arrays.sort() that sorts the chunk(s) of the result
bindings runs to completion before the cancel finishes.
[See QueryIterSort and SortedDataBag.]

For a large result set, this results in a long wait
before the cancelled request finally finishes. This
can be inconvenient.

The cancel request can be sneaked into the sort by
way of the comparator [1] and adding an instance
variable `cancelled` to SortedDataBag, set `true`
from QueryIterSort.requestCancel(). The comparator
checks `cancelled` and if it has become `true`
throws an exception, which is then caught outside
the call to Arrays.sort(), abandoning the sort.
See attached diff.

Questions arising:

* is it safe to abandon a sort from inside a comparator?
    (can't see anything that suggests otherwise.)

* are there threading issues that have to be dealt with
    other than by making the `cancelled` flag volatile?

If what I suggest appears to be sane I'll make it a
   pull request and run the process.

Chris

[1] Using a wrapper to handle the test for cancellation
      and then delegating `compare` to the comparator
      supplied to SortedDataBag.
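
The attached diff is not reproduced here. A minimal sketch of the wrapper described in [1], assuming a volatile flag and an unchecked exception, might look like the following. The names CancellableSortSketch, CancellableComparator, SortCancelledException and sortUnlessCancelled are illustrative only and are not taken from SortedDataBag or QueryIterSort.

```java
import java.util.Arrays;
import java.util.Comparator;

final class CancellableSortSketch {

    /** Hypothetical unchecked exception used to abandon the sort. */
    static class SortCancelledException extends RuntimeException {
        SortCancelledException() { super("sort cancelled"); }
    }

    /** Checks the cancel flag, then delegates to the supplied comparator. */
    static class CancellableComparator<T> implements Comparator<T> {
        private final Comparator<? super T> delegate;
        // volatile so a cancel from another thread is seen by the sorting thread
        private volatile boolean cancelled = false;

        CancellableComparator(Comparator<? super T> delegate) {
            this.delegate = delegate;
        }

        /** Would be called from the cancel path, e.g. a requestCancel() hook. */
        void cancel() {
            cancelled = true;
        }

        @Override
        public int compare(T a, T b) {
            if (cancelled) {
                throw new SortCancelledException();
            }
            return delegate.compare(a, b);
        }
    }

    /** Returns true if the sort completed, false if it was abandoned. */
    static <T> boolean sortUnlessCancelled(T[] data, CancellableComparator<T> cmp) {
        try {
            Arrays.sort(data, cmp);
            return true;
        } catch (SortCancelledException e) {
            // Caught outside Arrays.sort(); the array order is unspecified now.
            return false;
        }
    }

    public static void main(String[] args) {
        Integer[] data = {3, 1, 2};
        CancellableComparator<Integer> cmp =
                new CancellableComparator<>(Comparator.<Integer>naturalOrder());
        cmp.cancel();   // simulate a cancel arriving before the sort starts
        System.out.println("completed: " + sortUnlessCancelled(data, cmp));
    }
}
```

The exception is unchecked because Comparator.compare() cannot declare checked exceptions. After an abandoned sort the array should be dropped, not reused.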











