Re: RFR: 8284050: [vectorapi] Optimize masked store for non-predicated architectures [v2]

2022-05-05 Thread John Rose


> On May 4, 2022, at 8:29 PM, Xiaohong Gong  wrote:
> 
> The offset check could save the `checkMaskFromIndexSize`  for cases that 
> offset are in the valid array bounds, which also improves the performance. 
> @rose00 , do you think this part of change is ok at least?

That part is ok, yes. I wish we could get the same effect with loop 
optimizations but I don’t know an easy way. The explicit check in the source 
code gives the JIT a crutch but I hope we can figure out a way in the future to 
integrate mask logic into range check elimination logic, making the crutches 
unnecessary. For now it’s fine. 

Re: RFR: JDK-8282798 java.lang.runtime.Carrier [v11]

2022-03-21 Thread John Rose
On 21 Mar 2022, at 11:05, ExE Boss wrote:

> Actually, this lookup object should probably be kept cached.

+1.  Good “cache”.


Re: RFR: JDK-8282798 java.lang.runtime.Carrier

2022-03-08 Thread John Rose
On 8 Mar 2022, at 11:53, Jim Laskey wrote:

>> I'm not sure I see anything here that would prevent the user from building a 
>> carrier and casting to Object[] ?

> Clarified the comment. You are right. When I reworked the original code I 
> dropped the intermediate object. For my purposes, it was about performance. 
> If the object needs to be protected then it's up to the client to protect.  
> The same would be true if I add byte[] for all binary carriers.

I would prefer protecting the Object[] inside a non-public-class envelope 
object.

It’s a performance hit, but only on a path (no class spun b/c a high number of 
components) where there is no detailed optimization strategy anyway.

For the less-optimized path, putting everything into an Object[] (boxing 
primitives as well?) suggests that performance (+ footprint) is not an issue 
anyway.  In that case, the extra cost of adding the envelope object 
(indirection cost + footprint) not be a unique cost.

The best reason to put an envelope around the Object[] array (or other 
user-crackable representation) is the same reason we randomize encounter order 
in Set.of and Map.of.  It’s extra work, but it prevents users from building in 
out-of-contract dependencies in their code.  If this API exposes Object[] 
values, even temporarily, is is possible (even likely) that users will notice 
and build in out-of-contract dependencies.  The sad result of “optimizing” the 
Object[] representation by omitting the envelope will be that further 
optimizations (real ones!) will be harder to get accepted.

Summary:  Put the envelope around the Object, on paths which use Object[].

Second suggestion, as a follow-on task for later, is add a long[] array into 
that same envelope and use the long[] array for all primitive values, instead 
of boxing them as Integer, Long, etc.  (Or put the long[] array as the first 
item in the Object[] array.)  It probably doesn’t matter, but 32-bit values 
could be stored in high or low halves of the long elements.  (I think I’m not 
the only one to prototype it that way; I’ll bet Remi did something similar.)


Re: RFR: 8280124: Reduce branches decoding latin-1 chars from UTF-8 encoded bytes

2022-01-18 Thread John Rose
It’s kind of sad that you have to do this by hand.
The JVM should recognize that those back-to-back
if statements (in the bytecode) can be combined
into a non-branching O(1) test.

One item on my (very long) wish list for the JVM
is giving more comprehensive attention to the task
of testing membership in small finite sets.  This
one asks whether a given byte is one of two values,
where they happen to be consecutive and also have
almost the same bit pattern.  A more general version
of the problem would be any switch of this form:

switch (x) {
case 2: case 3: case 5: case 7: case 16:
default: return false;
}

As you can see, this is again a test for membership
in a small, statically determined set.  As long as
the static set is “small enough” (a flexible definition)
there is always a way that is “very fast” (flexible again)
to detect set membership, and in particular faster than
a branch tree (unless you know which branch is very likely
to be taken).  The “common bad” case is having to generate
a small perfect hash function, based on a multiply.  The
best case is an ad hoc detection of an arithmetic or
bitwise pattern, as you have done here by hand.

If the JVM had such a set-test generator, it could (a) turn
branch nests (from switches or from random user code) into
O(1) set tests, and (b) also supply a method handle factory
for them, which would be a very useful way to translate
indy-mediated discrimination trees that underly
pattern-matching switches.  Also it could translate
character class tests in the regex package.  (I started
thinking about this long ago when staring at hand-written
switches in Java code that implement character classes.)

— John

On 18 Jan 2022, at 2:44, Claes Redestad wrote:

> On Tue, 18 Jan 2022 10:08:35 GMT, Claes Redestad  wrote:
>
>> This resolves minor inefficiency in the fast-path for decoding latin-1 chars 
>> from UTF-8. I also took the opportunity to refactor the StringDecode 
>> microbenchmark to align with recent changes to the StringEncode micro.
>>
>> The inefficiency is that this test is quite branchy:
>>
>> `if ((b1 == (byte)0xc2 || b1 == (byte)0xc3) && ...`
>>
>> Since the two constant bytes differ only on the lowest bit this can be 
>> transformed to this, saving us a branch:
>>
>> `if ((b1 & 0xfe) == 0xc2 && ...`
>>
>> This provides a small speed-up on microbenchmarks where the input can be 
>> internally encoded as latin1:
>>
>>
>> Benchmark (charsetName) Mode Cnt Score Error Units
>> StringDecode.decodeLatin1LongStart UTF-8 avgt 50 2283.591 ± 12.332 ns/op
>>
>> StringDecode.decodeLatin1LongStart UTF-8 avgt 50 2165.984 ± 13.136 ns/op
>
> On a microbenchmark that zooms in on the logical predicate the speed-up is 
> closer to 2x. This seems like a transformation a JIT could do automatically. 
> gcc and clang doesn't do it, but icc seem to pull it off (as tested via 
> godbolt.org). It's unclear if this is common enough to motivate such 
> enhancement work, but it might be of academic interest to attempt it.
>
> -
>
> PR: https://git.openjdk.java.net/jdk/pull/7122


Re: [jdk18] RFR: 8278607: Misc issues in foreign API javadoc

2021-12-13 Thread John Rose
Nice work.  I like this API a lot, and the doc is getting a good clear 
shine on it.


The run-on sentence at the top of `Foreign addresses` needs a 
bit more polish.  The long parenthetical comment still reads awkwardly.  
Maybe it doesn’t need parentheses any more, now that it’s part of 
(or all of) its own small sentence?


On 13 Dec 2021, at 13:20, Maurizio Cimadamore wrote:

This patch fixes a number of issues and typos in the foreign API 
javadoc following another internal round of reviews.


-

Commit messages:
 - Misc foreign API javadoc fixes

Changes: https://git.openjdk.java.net/jdk18/pull/17/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk18=17=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8278607
  Stats: 35 lines in 7 files changed: 3 ins; 1 del; 31 mod
  Patch: https://git.openjdk.java.net/jdk18/pull/17.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk18 
pull/17/head:pull/17


PR: https://git.openjdk.java.net/jdk18/pull/17


Re: RFR: 8274412: Add a method to Stream API to consume and close the stream without using try-with-resources

2021-10-11 Thread John Rose
So the purpose of TWR is to hold an object with a “close-debt”
(debt of a future call to close) and pay it at the end of a block,
sort of like C++ RAII (but also sort of not).

But fluent syntaxes (which I like very much and hope to see
more of in the future!) don’t play well with blocks, so if a
fluent chain (any part of that chain:  It’s multiple objects)
incurs a “close-debt”, it’s hard to jam a TWR block into it.

Hence the current proposal.  I agree with Brian and Paul
that we haven’t examined all the corners of this problem
yet.  And I’d like to poke at the concept of “close-debt” to
help with the examination.

Just for brain storming, I think we could model “close-debt”
outside either fluent API structure or TWR block structure.
Java O-O APIs are the pre-eminent way to model things in
Java, and they work exceedingly well, when used with skill.

AutoCloseable models close-debt of course.  But it has two
weaknesses:  It doesn’t model anything *other* than the
debt, and its (sole) method skates awkwardly around the
issue of checked exceptions.  (It requires an override with
exception type narrowing to be used in polite company.)
AC is more of an integration hook with TWR, rather than
a general-purpose model for close-debt.  Therefore it doesn’t
teach us much about close-debt in a fluent setting.

Surely we can model close-debt better.  Let’s say that an
operation (expression) with close-debt *also* has a return
value and (for grins) *also* has an exception it might throw.
This gets us to an API closer to Optional.  (If you hear the
noise of a monad snuffling around in the dark outside
your tent, you are not wrong.)

interface MustClose_1 {
   T get() throws X;  //or could throw some Y or nothing at all
   void close() throws X;
}

(I wish we had an easier way to associate such an X
with such a T, so that Stream could be more
interoperable with simple Stream.  It’s a pain to
carry around those X arguments.  So I’ll drop X now.)

interface MustClose_2 {
   T get();
   void close() throws Exception;
}

An expression of this type requires (in general) two
operations to finish up:  It must be closed, and the result
(type T) must be gotten.  There’s an issue of coupling between
the two methods; I would say, decouple their order, so that
the “get” call, as with Optional, can be done any time,
and the “close” call can be done in any order relative
to “get”.  Both calls should be idempotent, I think.
But that’s all second-order detail.

A first-order detail is the apparent but incorrect 1-1 relation
between T values and close-debts.  That’s very wrong;
a closable stream on 1,000 values has one close-debt,
not 1,000.  So maybe we need:

interface MustClose_3 {
   S map(Function value);
   void close() throws Exception;
}

That “map” method looks a little like Remi’s apply
method.  Did I mention this design requires skill
(as well as flexibility, with one hand already tied
by checked exceptions)?  I’m at the edge of my own
skill here, but I think there’s good ground to explore
here.

In a fluent setting, a Stream that incurs a close-debt
might be typed (after incurring the debt, perhaps in a
transform) as Stream>, and somehow
all consumers of the MustClose, such as map and
collect operations on the Stream, would correctly
unpack each T from the MC, and then repack
the result into the MC<.> wrapper.

var res = openAFileStream().map(…).collect(…);

Here the first method call returns a Stream with
close-debt mixed into its type.  The map and collect
calls would wire both parts:  The T values flowing
through, and the close-debt.  Who takes responsibility
for paying the close debt?  Maybe an extra call
at the end:  …map(…).collectAndClose(…).
Or maybe the stream “knows” internally that since
its type has a close debt, all of its terminal operations
have to pay off that debt as they collect the payloads.
So it would be automatic, somehow, inside of
collect, forEach, etc.

To make the parts hook up right, you might need
reified generics, or maybe an amended type
MustCloseStream <: Stream>,
like the LongStream <: Stream we dislike.
I’m only proposing as a thought exercise for now.

Maybe the MustCloseStream takes an explicit close
method which produces a regular stream over the
base type.  The explicit close method would release
resources and buffer anything necessary to produce
an in-memory Stream.  You’d want to call it
late in the fluent chain, after filtering and flat-mapping
is done, just before a collect for forEach.

Here’s a streamlined version of MustClose that
I like, which sidesteps the problem of mutual ordering
of two methods:

interface MustClose_4 {
   R getAndClose() throws Exception;
   default void close() throws Exception { getAndClose(); }
}

Here, R is not an element type of a stream, but rather
the final result (maybe Void) of some terminal operation.

Such an interface could interact with syntax, too.  For
example, it might help with TWR expressions (if we
wanted to think about that):

var res 

Re: RFR: 8273656: Improve java.lang.invoke.MethodType.parameterList() and its usage

2021-09-21 Thread John Rose
On Sep 13, 2021, at 10:24 AM, Vladimir Ivanov 
mailto:vliva...@openjdk.java.net>> wrote:

BTW it can be improved even further by caching the immutable List view of 
parameters.

I would go further:  If I were writing MethodType.java today
I would probably use List.of as the backing store for the
parameters, instead of the private array.  So ptypes should
be List> not Class[].  I don’t think the footprint
or polymorphism effects would be dealbreakers, and the
code would (I think) be simpler overall.  But that’s a messy
change, since the array representation is dug in.



Re: better random numbers

2021-09-17 Thread John Rose
On Sep 7, 2021, at 4:48 AM, Stefan Zobel 
mailto:splitera...@gmail.com>> wrote:

That "influential researcher" is probably Sebastiano Vigna who has indeed
harsh words on PCG: https://pcg.di.unimi.it/pcg.php

That link can also be found on ONeill’s blog, along with her
responses.

https://www.pcg-random.org/posts/on-vignas-pcg-critique.html



Re: better random numbers

2021-09-06 Thread John Rose
On Sep 5, 2021, at 3:23 PM, John Rose 
mailto:john.r.r...@oracle.com>> wrote:

To increase throughput use vectors or generate more than one random sample per 
crank turn. But back to back aes steps are probably always twice the latency of 
a single wide multiply. So I think there might be some more room for cleverly 
using single aes rounds, say two in parallel with 1-cycle input transforms. Put 
simply, engineers and academics approach novelty differently.

(Yikes!  I typed the mail on my phone and somehow that last sentence
went to the wrong place.  It should have been at the end of the PP that
speculated about differing motivations surrounding publication
and engineering.)


Re: better random numbers

2021-09-05 Thread John Rose
On Sep 5, 2021, at 7:44 AM, Andrew Haley  wrote:
> 
> On 9/3/21 12:35 AM, John Rose wrote:
> 
>> The reference I’d like to give here is to Dr. Melissa O’Neill’s
>> website and articles:
> 
> I'm quite sceptical. Anyone who says a (non-cryptographic) random-
> number generator is "hard to predict" is either quite naive or in a
> state of sin, (;-) and while O’Neill’s argument seems sound, it
> doesn't seem to have convinced the academic world.

She meets one of these objections on her blog. But IMO there’s not much that 
can be said here, since “hard” is hard to define. It’s a practical question to 
be addressed (tentatively) by practical arguments. 

> 
> Lemire is thoughtful:
> https://lemire.me/blog/2017/08/15/on-melissa-oneills-pcg-random-number-generator/

That’s a good link thanks. 
> 
> I wonder about AES, which can do (on Apple M1) 2 parallel rounds per
> clock cycle. I'm quite tempted to try a reduced- round AES on the
> TestU01 statistical tests. Maybe 6 rounds?

I think I’ve mentioned using aes as a mix step at one or two JVMLS meetings. I 
think off label  low-round use of HW accelerated crypto steps is a good use, to 
build sub-cryptographic but “hard” (that word!) prngs and hashes. 

I expect a 128-bit multiply has a comparable amount of cascading with two aes 
rounds; probably the aes does a better job since it has no weak lanes. At least 
it looks that way on paper. So I’d start with 2 rounds and try various ways to 
replace a big multiply with the aes.

To increase throughput use vectors or generate more than one random sample per 
crank turn. But back to back aes steps are probably always twice the latency of 
a single wide multiply. So I think there might be some more room for cleverly 
using single aes rounds, say two in parallel with 1-cycle input transforms. Put 
simply, engineers and academics approach novelty differently. 

The data driven ror/lsl is less valuable as an output transform since it 
doesn’t need to cover a structural problem with the main advance function.  
Xoro and pcg are drastically linear so they need a nonlinear mix function 
somewhere. 

That said, aes probably has its own odd quirks and regularities, at very low 
round numbers. one reason I like LCG methods is they use very well understood 
basic components. Which means if they have a flaw (in the low bits) it’s easier 
to account for and doesn’t require as much original research. So I prefer LCG 
over XOR networks and AES because it’s been examined more carefully.

Perhaps the academic publishing world has an opposite value function: if you 
are using old research in a new way to solve problems they don’t care as much 
as if there is a new technique. After all, finding the flaws in something new 
will fill new journal slots and get more traffic. 


> However, there can be a
> long latency between FPU and integer CPU, so perhaps it's not such a
> great idea.

That’s why I’d consider batching. 

> Also, you have to load the key registers before you can
> generate a random number, so it only really works if you want to
> generate a lot of bits at a time.

Yes, that. 

> But it is maybe 128 randomish bits
> per a few clock cycles.

That’s the attraction. But you can do similar things with vectorized 64-bit 
operations. As long as you can clean up the weak low bits you have good mixing 
of state. 

Thanks for the comments. 
> 
> -- 
> Andrew Haley  (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
> 


Re: better random numbers

2021-09-02 Thread John Rose
On Sep 2, 2021, at 4:35 PM, John Rose 
mailto:john.r.r...@oracle.com>> wrote:

The state of the art for PRNGs (pseudo-random number generators) is
much advanced since ju.Random was written.

Surely at some point we will refresh our APIs that produce random
numbers.  In fact, we have added SplittableRandom, but I think the
state of the art is farther enough along to consider another refresh.

Well, I didn’t know about https://openjdk.java.net/jeps/356
when I wrote the previous, so I’m a little late to be as helpful
as I hoped to be.  Still, Joe and Brian B. (thanks guys for handing
me the Missing Clue) lead me to hope we can add options to cover
the PCG algorithms, if (as I think) they turn out to be useful.

The 128-bit PCG state with the 64-bit multiplier of da942042e4dd58b5
looks especially promising to me.  It’s used here:

https://github.com/imneme/pcg-cpp/blob/ffd522e7188bef30a00c74dc7eb9de5faff90092/include/pcg_random.hpp#L176

O’Neill says that’s her PRNG of choice for most purposes is
pcg64_fast (defined in the file above using that multiplier).
It competes better with the xoshiro family performance
than the shootout figures provided by the xoshiro developers
would seem to indicate.  And the state size for pcg64_fast is
128 bits which (I think) is about half the size of a xoshiro state.



better random numbers

2021-09-02 Thread John Rose
The state of the art for PRNGs (pseudo-random number generators) is
much advanced since ju.Random was written.

Surely at some point we will refresh our APIs that produce random
numbers.  In fact, we have added SplittableRandom, but I think the
state of the art is farther enough along to consider another refresh.

(When we get advanced numerics via Valhalla, I think we will probably
want generic algorithms on 128-bit int and vectorized states, as well;
that’s just a guess.)

I bring this up not for any immediate purpose, but because every so
often I google around for information about PRNGs, and I have noticed
that a particular reference (since at least 2018) keeps coming to the
top of the pile of interesting stuff.  I’d like to share it with you
all, and talk about making use of it.

(FWIW, last weekend I was doing some experiments with perfect hash
functions, and I wanted to roll my own PRNG to drive the search for
them.  A few months ago my problem was test vector generation, and
again I rolled my own PRNG after looking at ju.*Random.)

The reference I’d like to give here is to Dr. Melissa O’Neill’s
website and articles:

https://www.pcg-random.org

…Read more here:

http://cr.openjdk.java.net/~jrose/jdk/pcg-psa.txt

Re: [External] : Re: RFR: 8199318: add idempotent copy operation for Map.Entry

2021-06-03 Thread John Rose
On Jun 3, 2021, at 3:17 PM, fo...@univ-mlv.fr<mailto:fo...@univ-mlv.fr> wrote:




De: "John Rose" mailto:r.r...@oracle.com>>
À: "Remi Forax" mailto:fo...@univ-mlv.fr>>
Cc: "Peter Levart" mailto:peter.lev...@gmail.com>>, 
"Rémi Forax" mailto:fo...@openjdk.java.net>>, 
"core-libs-dev" 
mailto:core-libs-dev@openjdk.java.net>>
Envoyé: Jeudi 3 Juin 2021 22:51:28
Objet: Re: RFR: 8199318: add idempotent copy operation for Map.Entry
...

interface ComparableRecord>
extends Comparable { … }

record Foo(int x, String y) implements ComparableRecord { … }

http://cr.openjdk.java.net/~jrose/draft/ComparableRecord.java

[repost with a link]

The main issue with this kind of code is that the JIT does not see through the 
ClassValue.

Wow, it’s almost as if we’ve discussed this already! ;-)
So, https://bugs.openjdk.java.net/browse/JDK-8238260
Part of my agenda here is finding more reasons why
JDK-8238260 deserves some love.

Currently, the translation strategy for Records
requires a lot of boilerplate generated into each
subclass for toString, equals, and hashCode.
It is partially declarative, because it uses indy.
So, yay for that.  But it is also a bad smell (IMO)
that the trick needs to be played in each subclass.

If ClassValue were performant, we could have
just one method in Record for each of toString,
equals, and hashCode, and have them DTRT.
The user code would then be able to call
super.toString() to get the standard behavior
in a refining wrapper method in a subclass.

Looking further, it would be nice to have methods
which (a) inherit from supers as reusable code
ought to, but (b) re-specialize themselves once
per subclass indy-style.  This is a VM trick I hope
to look into after we do specialization.  For now,
ClassValue is the way to simulate that.


Tweaking a little bit your code, I get
https://gist.github.com/forax/e76367e1a90bf011692ee9bec65ff0f8<https://urldefense.com/v3/__https://gist.github.com/forax/e76367e1a90bf011692ee9bec65ff0f8__;!!GqivPVa7Brio!MheW9rHDHXlXYNKUju7v5G0vXlpj1YOoDWFjG9AcpnXnVz2TxlMYDM7i86yFtT_B$>

(It's a PITA that we have to use a raw type to workaround circularly defined 
parameter type)

I tried to DTRT and got into Inference Hell, surrounded
by a swarms of unfriendly wildcards with pitchforks.
Glad it wasn’t just me.

Inspired by your more whole-hearted use of streams
to build the code, I updated my example.  Thanks.

— John


Re: RFR: 8199318: add idempotent copy operation for Map.Entry

2021-06-03 Thread John Rose
On Jun 3, 2021, at 12:46 PM, Remi Forax 
mailto:fo...@univ-mlv.fr>> wrote:

I kind of regret that the compiler does not provide automatically an 
implementation of compareTo if the record implements Comparable.
People sucks at writing compareTo and the resulting bugs are hard to 
find/reproduce.

That’s a slippery slope.  IIRC we consciously stopped
before that step.

That said, there are other ways to fix this.  We should
have utilities (maybe in the JDK but not the JLS) which
build such methods and make it easy for users to grab onto
them.  Maybe something like this:

interface ComparableRecord>
extends Comparable { … }

record Foo(int x, String y) implements ComparableRecord { … }

http://cr.openjdk.java.net/~jrose/draft/ComparableRecord.java

— John


Re: RFR: JDK-8267936: PreserveAllAnnotations option isn't tested

2021-06-01 Thread John Rose
On Jun 1, 2021, at 7:02 AM, Brian Goetz 
mailto:brian.go...@oracle.com>> wrote:

As Alan's archaeology discovered, this flag appears to be a leftover from the 
original implementation, and I could find no signs of its usage.

Note to self and other reviewers of future versions
of the JVMS: When you see proposed language like
the “unless…” of JVMS 4.7.17, remember today's
conversation and try to avoid putting dark corners
like that into the JVMS.

The RuntimeInvisibleAnnotations attribute is similar to the 
RuntimeVisibleAnnotations attribute (§4.7.16), except that the annotations 
represented by a RuntimeInvisibleAnnotations attribute must not be made 
available for return by reflective APIs, [[WE SHOULD HAVE STOPPED HERE]] unless 
the Java Virtual Machine has been instructed to retain these annotations via 
some implementation-specific mechanism such as a command line flag. In the 
absence of such instructions, the Java Virtual Machine ignores this attribute.
https://docs.oracle.com/javase/specs/jvms/se16/html/jvms-4.html#jvms-4.7.17


Re: [External] : Re: RFR: 8263087: Add a MethodHandle combinator that switches over a set of MethodHandles

2021-04-09 Thread John Rose
On Apr 9, 2021, at 4:00 PM, John Rose 
mailto:john.r.r...@oracle.com>> wrote:

The MH combinator for lookupswitch can use a data-driven
reverse lookup in a (frozen/stable) int[] array, using binary
search.  The bytecode emitter can render such a thing as
an internal lookupswitch, if that seems desirable.  But
the stable array with binary search scales to other types
besides int, so it’s the right primitive.

This may be confusing on a couple of points.
First, the mapping function I’m talking about is not
a MH combinator, but rather a MH factory, which takes
a non-MH argument, probably an unsorted array or List
of items of any type.  It then DTRT (internal hash map
or tree map or flat binary search or flat table lookup or
lookupswitch or any combination thereof) to get
an algorithm to classify the array or List elements
into a compact enumeration [0,N).

Second, when the input array or List is of int (or
Integer) then it *might* be a lookupswitch internally,
and I’m abusing the terminology to call this particular
case a lookupswitch.  But it’s really just a classifier
factory, whose output is a MH of type K → [0,N) for
some K.  The output might also be ToIntFunction
for all I care; that can be inter-converted with a MH.






Re: [External] : Re: RFR: 8263087: Add a MethodHandle combinator that switches over a set of MethodHandles

2021-04-09 Thread John Rose
On Apr 9, 2021, at 11:15 AM, fo...@univ-mlv.fr wrote:
> 
> - Mail original -
>> De: "John Rose" 
>> À: "Remi Forax" 
>> Cc: "Jorn Vernee" , "core-libs-dev" 
>> 
>> Envoyé: Vendredi 9 Avril 2021 20:01:18
>> Objet: Re: RFR: 8263087: Add a MethodHandle combinator that switches over a 
>> set of MethodHandles
> 
> Hi John,
> 
>> On Apr 9, 2021, at 9:55 AM, Remi Forax  wrote:
>>> 
>>> I think the combinator should be lookupswitch which is more general than
>>> tableswitch with a special case when generating the bytecode to generate a
>>> tableswitch instead of a lookupswitch if the indexes are subsequent.
>> 
>> We can get there in the simpler steps Jorn has outlined.
> 
> I fail to see how it can work.

If you have a fixed set of N cases it is always valid
to number them compactly in [0,N).  If there is
another association of keys in some set K to cases,
then you simply build a mapping K → [0,N).  That
works for enums, lookupswitch, strings, and patterns.
The mapping functions would be:
  - Enum::ordinal
  - a lookupswitch of cases `case i: return n`, n in [0,N]
  - some perfect hash, composed with lookupswitch
  - some decision tree that outputs compact case numbers

In the second case, C2 will inline the lookupswitch and
tableswitch together and do branch-to-branch tensioning.
The result will be the same as if the intermediate tableswitch
had not been used.

The MH combinator for lookupswitch can use a data-driven
reverse lookup in a (frozen/stable) int[] array, using binary
search.  The bytecode emitter can render such a thing as
an internal lookupswitch, if that seems desirable.  But
the stable array with binary search scales to other types
besides int, so it’s the right primitive.

The SwitchBootstraps class is the place to match a
static decision tree or decision chain (of N cases) of
an arbitrary shape to compact case labels in [0,N).
Then they all feed to Jorn’s primitive.

> 
>> 
>> The combinator is much simpler if the case numbers are implicit in [0,N). 
>> Then
>> it’s natural to filter on the [0,N) input as a separately factored choice.
> 
> An array of MethodHandles + a default method handle is simpler than an array 
> of sorted ints + an array of MethodHandles + a default method, but not much 
> simpler.

Simpler by the complexity of the sorting, which is a sharp edge.
The type “sorted int array without duplicates and unaliased
or frozen” is pretty involved.  Easy to make mistakes with it.

> 
>> That also scales to pattern-switch.
> 
> yes, for all the switches, pattern-switch, enum-switch but not for the string 
> switch which requires a lookup switch.

Nope; see above.

> Can you outline how to use the tableswitch combinator in the case of a switch 
> on strings ?

Above; reduce to perfect hash plus lookupswitch
producing compact int values to feed to a tableswitch.

Summary:  Switches only need one case label domain:  [0,N).
Everything else is case label mappings.

— John

Re: RFR: 8263087: Add a MethodHandle combinator that switches over a set of MethodHandles

2021-04-09 Thread John Rose
On Apr 9, 2021, at 9:55 AM, Remi Forax  wrote:
> 
> I think the combinator should be lookupswitch which is more general than 
> tableswitch with a special case when generating the bytecode to generate a 
> tableswitch instead of a lookupswitch if the indexes are subsequent.

We can get there in the simpler steps Jorn has outlined.

The combinator is much simpler if the case numbers are implicit in [0,N). Then 
it’s natural to filter on the [0,N) input as a separately factored choice. That 
also scales to pattern-switch. 

I agree with the choice to have N call sites. It’s possible to build the one 
call site version on top using constant combinators but not vice versa. 


Re: RFR: 8180352: Add Stream.toList() method

2020-11-03 Thread John Rose
I agree with Stuart and Brian that streams should be null-tolerant,
including in this case.  And of course I’m delighted to see an
immutable container as the output to the utility method; I’m
a fan of reduced-copy, race-free, bug-resistant algorithms you
get from immutability.

On Nov 3, 2020, at 6:53 AM, Remi Forax  wrote:
> 
> Also, adding a third immutable list creates a problem, it means that now when 
> we get an immutable list it can be 3 different implementations but the VM 
> only uses bimorphic inlining cache,
> so more callsites will fail to inline because of that. I think we have 
> already reduced the number of implementation of immutable map from 3 to 2 for 
> the very same reasons.

Yes, this part concerns me, with my JIT hat on.

It seems to me that the problem can be removed simply by allowing
the existing ListN object to contain nulls.  That’s not the same
thing as allowing List.of(1,2,3, null):  That factory method can
reject nulls, while the privileged factory method that makes
an array-backed ListN can simply sidestep the null check.

And if a call to Stream::toList call returns 0, 1, or 2 items,
it’s a straightforward matter to test if either is null, and then
choose to use either ListN or List12.  No new implementation
classes, at all!

Re: Possible subtle memory model error in ClassValue

2020-08-07 Thread John Rose
On Aug 7, 2020, at 2:35 PM, John Rose  wrote:
> 
> (Here’s a tidbit of JMM politics:  I once heard Doug Lea
> considering whether maybe all fields should be treated
> more like final fields.  I don’t know if this is still a live
> idea, but it would make this bug go way, since nearly all
> constructors would then get fences of some sort.)

Paul helpfully pointed me at Aleksey’s excellent description
and investigation of this proposal, which I had forgotten about:

https://shipilev.net/blog/2014/all-fields-are-final/
https://bugs.openjdk.java.net/browse/JDK-8031818

The JVM flag -XX:+AlwaysSafeConstructors might be
useful to enable this proposed feature and see if it suppresses
the bug.  (Don’t forget to unlock.)

Looks like the conversation is stalled, pending further
insights and problem fixes.  I had an idea for fixing one
of the problems with chained constructors, commented
on JDK-8032218.

— John

Re: Possible subtle memory model error in ClassValue

2020-08-07 Thread John Rose
On Aug 7, 2020, at 9:52 AM, Andrew Haley  wrote:
> 
> On 8/7/20 4:39 PM, David Lloyd wrote:
> 
> > However, I'm wondering if this isn't a side effect of what appears
> > to be an incorrect double-checked lock at lines 374-378 of
> > ClassValue.java [1].  In order for the write to the non-volatile
> > `cache` field of ClassValueMap, it is my understanding that there
> > must be some acquire/release edge from where the variable is
> > published to guarantee that all of the writes are visible to the
> > read site, and I'm not really sure that the exit-the-constructor
> > edge is going to match up with with the synchronized block which
> > protects a different non-volatile field.  And even if my feeling
> > that this is dodgy is valid, I'm not sure whether this NPE is a
> > possible outcome of that problem!
> 
> It certainly doesn't look right to me. AFAICS this is classic broken
> double-checked locking. It'd need some sort of fence after
> constructing the ClassValueMap instance and before publishing it.

Y’all are calling my baby ugly but he’s really cute if you look
at him from the right angle.

(First, a quick question:  This bug shows up on x86, right?
If so, we probably are not looking at a hardware reordering
problem but something stemming from Graal’s reordering
of operations compared to C1 and C2, perhaps after a different
set of inlining decisions uncharacteristic of C2.)

The bug indeed looks like a dropped fence at the end of the
constructor for ClassValueMap.  (I don’t see an easier way to
get a null at the NPE point.)  The fence might not actually be
dropped, but a write which the fence is supposed to fence
might have been reordered after the fence, and after an
unlock operation, allowing someone to see the initial
null after the unlock.

The double-check idiom breaks when the outer check
(not synchronized) gets half-baked data and acts on it.
This much-maligned idiom would not be broken in the
present case under the assumption that a data dependency
on type.classValueMap (set in initializeMap) will see
a fully initialized value of ClassValueMap.cacheArray.

Now we get into the fine print, and I agree there is a bug
here.  The easy fix is to repair the logic under which that
ugly everybody’s-hating-on-it double check idiom would
be in fact correct.

The fine print requires a couple different things that are not
quite fulfilled by the present code, and Graal has either
reordered the memory accesses to expose the problem, or
it (itself) has a complementary bug.  (Or it has an unrelated
bug, and you people are *staring* at my baby!)

First, as Paul noted, you need a final variable in your class
to get the benefit of a fence at the end of the constructor.
Oops #1:  ClassValueMap doesn’t have *any* finals.  That’s
awkward.  Two fixes for that:  (a) add a dummy final, and
(b) add a real fence in the constructor.  For better luck,
maybe put the fence at the end of sizeCache.

On machines (not x86) which need fences *before* final
reads, another explicit fence should be placed before the
unsynchronized read (or else inside the getCache method).

By the letter of the JMM, the helpful effects of the fence
at the end of the constructor are only guaranteed for
references which depend on final variables set by that
constructor, because only those final variables get a
“freeze” operation which stabilizes them (and values
dependently loaded via them).  Throwing in a dummy
final is therefore not guaranteed to work.  But throwing
in a fence will work.  One downside of the fence is that
the JMM is silent about fences, but the JMM is widely
recognized as a partial story, not the full or final story.

(Here’s a tidbit of JMM politics:  I once heard Doug Lea
considering whether maybe all fields should be treated
more like final fields.  I don’t know if this is still a live
idea, but it would make this bug go way, since nearly all
constructors would then get fences of some sort.)

Here’s a bit of explanatory (non-normative) text from the draft
POPL paper for JMM, where the parenthetic comment indicates
(I think) the sort of thing that is happening here:

> Let us now assume that the acquire and release do happen. As long as
> these actions take place after object has been constructed (and there
> is no code motion around the end of the constructor), the diffs that
> the second processor acquires are guaranteed to reflect the correctly
> constructed object.

Basically, the synchronization statement is expected to do a
release *after* the non-null value is stored.  It looks like this
is failing due to a lost and/or reordered fence.

Second, David says:

> I’m not really sure that the exit-the-constructor edge is going to match
> up with with the synchronized block which protects a different
> non-volatile field.

By the letter of the JMM (AFAIUI), this code is coloring way outside
the lines.  (Yes, I admit it.)  The problem is pretty fundamental.
I went and swapped in some JMM spec just now (don’t have it in
short 

Re: RFR[8238286]: 'Add new flatMap stream operation that is more amenable to pushing’

2020-06-24 Thread John Rose
On Jun 24, 2020, at 1:01 PM, John Rose  wrote:
> 
> What’s the performance model?  It seems like the SpinedBuffer
> implementation makes a worst-case assumption about the number
> of pending values, that there will be many instead of zero or one.
> 
> But I guess the pipeline stuff already works in terms of pushes, so
> the good news might be that this is really just a drill-down from the
> user API into the kinds of operations (push-flavored) that go on
> most of the time.

(I dove straight into the code and missed the implementation
note where you answered this question…  Never mind.)



Re: RFR[8238286]: 'Add new flatMap stream operation that is more amenable to pushing’

2020-06-24 Thread John Rose
I like this new API point a lot; it allows flexible, local, temporary
control inversion in the context of one stream transform.

What’s the performance model?  It seems like the SpinedBuffer
implementation makes a worst-case assumption about the number
of pending values, that there will be many instead of zero or one.

But I guess the pipeline stuff already works in terms of pushes, so
the good news might be that this is really just a drill-down from the
user API into the kinds of operations (push-flavored) that go on
most of the time.

OK, so I like the function but I have a beef with its bike shed
color.  First of all, this is a control-inversion (CPS) pattern,
which is very powerful but also notoriously hard to read.
I think that in Java APIs, at least in Stream APIs, code is
easier to read if the logical data flow is from left to right.

(This is a language-specific observation.  Apart from varargs,
Java method APIs read favorably when extra control arguments
are added onto the end of the argument list.  Also, the convention
for generic functional interfaces is that the return value type
goes to the right, e.g., R in Function.)

So the BiConsumer is backwards, because the logical return
should be written, if not as a true return (which would appear
at the end of type parameter lists), at the end of the incoming
parameters (and in the last type parameter).

I also think “multi” is needlessly “learned” sounding.  A simple
spatial preposition would work well: mapThrough, mapAcross, etc.
I think I prefer mapAcross because the term “across” can be read
adverbially: “we are mapping T across to Consumer”.

So:

mapAcross(BiConsumer> mapper)
mapAcrossToInt(BiConsumer mapper)
mapAcross​(IntObjConsumer mapper)

This does require additional FI’s like IntObjConsumer, but
I think that is a negligible cost.  Making the control inversion
*readable* is the high order bit here, not minimizing the number
of trivial FIs.

(I almost hear David Bowman, in his space suit, saying, “My API…
It’s full of bikesheds!”  There’s a meme for that.)

— John

On Jun 24, 2020, at 3:57 AM, Patrick Concannon  
wrote:
> 
> Hi,
> 
> Could someone please review myself and Julia's RFE and CSR for JDK-8238286 - 
> 'Add new flatMap stream operation that is more amenable to pushing’?
> 
> This proposal is to add a new flatMap-like operation:
> 
> ` Stream mapMulti(BiConsumer, ? super T> mapper)` 
> 
> to the java.util.Stream class. This operation is more receptive to the 
> pushing or yielding of values than the current implementation that internally 
> assembles values (if any) into one or more streams. This addition includes 
> the primitive variations of the operation i.e. mapMultiToInt, IntStream 
> mapMulti, etc.
> 
> issue: https://bugs.openjdk.java.net/browse/JDK-8238286 
> 
> csr: https://bugs.openjdk.java.net/browse/JDK-8248166 
> 
> 
> webrev: http://cr.openjdk.java.net/~pconcannon/8238286/webrevs/webrev.00/ 
> 
> specdiff: 
> http://cr.openjdk.java.net/~pconcannon/8238286/specdiff/specout.00/overview-summary.html
>   
> 
> 
> 
> Kind regards,
> Patrick & Julia



Re: (15) RFR: JDK-8247444: Trust final fields in records

2020-06-16 Thread John Rose
On Jun 15, 2020, at 2:58 PM, Mandy Chung  wrote:
> 
> Webrev: http://cr.openjdk.java.net/~mchung/jdk15/webrevs/8247444/webrev.00/ 
> 
> CSR: https://bugs.openjdk.java.net/browse/JDK-8247517 
> 

Good work.  Reviewed.  — John

Re: Review Request: JDK-8235521: Replacement API for Unsafe::ensureClassInitialized

2020-06-04 Thread John Rose
P.S.C. (post-send clarification)

> The workflow would be:
> 
> static final MethodHandle MH_ensureInit
>   = publicLookup().findVirtual(L…, “ensureInitialized”…);
> 

By that I mean the workflow of the dynamic language
implementor.  And after hitting “send” I realized that optimizing
that one case (of a findVirtual -> bindTo -> bindTo) is harder
than I thought.  The entire workflow below could instead be
a call to a wrapping function that takes an arbitrary “mh”
(as below) and wraps an init-barrier onto the front *if needed*.
It would have to take a Lookup and a Class, plus the mh.
It would handle all the internal stuff.  If the user wanted
a bare init MH, just pass in an empty MH as a “seed”.
And maybe take a short-circuit action if the “seed” comes
back unchanged; this is an alternative to isFullyInitialized,
but it seems sneaky.

class Lookup { …
/**
 * To the given method handle, prepend an action, if necessary, to trigger
 * initialization of the given class the first time the method handle is called.
 * If it would be illegal to call ensureInitialized on the given class from 
this lookup,
 * immediately report an error.  Otherwise, if the class is already fully 
initialized
 * return the method handle unchanged.  Otherwise, return a new method handle
 * (of the same type) which incorporates the necessary initialization action,
 * as if by a call to ensureInitialized.
 */
MethodHandle ensureInitialized(Class target, MethodHandle mh) {
  checkAccess(target);
  if (isFullyInitialized(target))   return mh;
  // we need to tack a “before action” onto mh to trigger inits:
 MethodHandle before = MH_ensureInit.bindTo(target);
 return collectArguments(mh, 0, before);
}

// note: IMPL_LOOKUP requires previous checkAccess call
private static final MethodHandle MH_ensureInit
  = publicLookup().findVirtual(L…, “ensureInitialized”…).bindTo(IMPL_LOOKUP);


Re: Review Request: JDK-8235521: Replacement API for Unsafe::ensureClassInitialized

2020-06-04 Thread John Rose
On Jun 3, 2020, at 7:38 PM, Mandy Chung  wrote:
> 
> On 6/3/20 6:24 PM, John Rose wrote:
>> On Jun 3, 2020, at 5:16 PM, Paul Sandoz > <mailto:paul.san...@oracle.com>> wrote:
>>> 
>>>  if (UNSAFE.shouldBeInitialized(cls)) {
>>>  UNSAFE.ensureClassInitialized(cls);
>>>  }
>>> 
>>> Although it seems redundant to perform the check, suggesting it is not 
>>> needed.
>> 
>> It’s useful (IIRC) in this kind of code:
>> 
>> MethodHandle mh = … something about cls ...;
>>  if (!UNSAFE.shouldBeInitialized(cls))
>> return mh;
>> // Oops, every eventual call to mh needs to do the check:
>> return filterArguments(mh, 0, UNSAFE_ensureClassInitialized.bindTo(cls));
>> 
> 
> It is useful for method handle machinery.   I don't see the need to expose it 
> as a supported API for java applications/framework.
> 
> Maybe useful for dynamic languages? 

Yes, it would be.  Since it’s a longer conversation, I suggest making
a *non-public* API point on Lookup, isFullyInitialized(Class).

Further suggestions:

See DirectMethodHandle.ensureInitialized for the use of both
Unsafe API points in concert; it has ticklish states involving
blocking on the thread doing the initialization.  I don’t want to
expose that yet.

But, factor all uses of those two Unsafe API in DMH.java through
the new Lookup API points, a public ensureInitialized (I like that
name best; it’s the one adopted internally to java.lang.invoke) and
also through a new non-public isFullyInitialized.  To avoid access
checks use IMPL_LOOKUP.

Note the inverted boolean sense.  The idea is that if isFullyInitialized
returns true, then there’s no point in arranging for future calls to
ensureInitialized.

If there is true demand for the second API point we can roll it out as
public.  But there’s a catch:  There’s a third initialization state,
“being initialized” which a dynamic language will also want to care
about.  So there’s a need for a third API point of some sort, not covered
by the second.

Here’s a quick thought about that third API point:  As things stand,
we are going to allow Lookup.findVirtual to apply to Lookup::ensureInitialized
to return a suitable access point, for future use, for anyone to use to
“tweak” any given class (modulo access restrictions).  Good.  We can
secretly add an optimization to the particular MH (and no other) so
that it uses the same machinery as MH’s returned for static getters
and setters and invokers that incorporate an init-barrier.

I’m not saying do this now, but maybe file a follow-up bug to add
that optimization to the resulting MH for Lookup::ensureInit,
if and when we add Lookup::isFullyInitialized.  That will give
the dynamic language people the same “hooks” we enjoy internally.

The workflow would be:

static final MethodHandle MH_ensureInit
   = publicLookup().findVirtual(L…, “ensureInitialized”…);

MethodHandle mh = …;
Class target = …;
if (!lookup().isFullyInitialized(target)) {
   // we need to tack a “before action” onto mh to trigger inits:
  MethodHandle before = MH_ensureInit.bindTo(lookup()).bindTo(target);
  mh = MethodHandles.collectArguments(mh, 0, before);
}
return mh;

— John

Re: Review Request: JDK-8235521: Replacement API for Unsafe::ensureClassInitialized

2020-06-04 Thread John Rose
On Jun 4, 2020, at 12:39 PM, mark.reinh...@oracle.com wrote:
> 
> I agree that the `ensure` prefix is useful here.
> 
> This method can only ensure that a class is initialized, so including
> `Class` in the method name seems redundant.  I’d go with the shorter
> `ensureInitialized`.

+1

Re: Review Request: JDK-8235521: Replacement API for Unsafe::ensureClassInitialized

2020-06-03 Thread John Rose
On Jun 3, 2020, at 5:16 PM, Paul Sandoz  wrote:
> 
>  if (UNSAFE.shouldBeInitialized(cls)) {
>  UNSAFE.ensureClassInitialized(cls);
>  }
> 
> Although it seems redundant to perform the check, suggesting it is not needed.

It’s useful (IIRC) in this kind of code:

MethodHandle mh = … something about cls ...;
 if (!UNSAFE.shouldBeInitialized(cls))
return mh;
// Oops, every eventual call to mh needs to do the check:
return filterArguments(mh, 0, UNSAFE_ensureClassInitialized.bindTo(cls));



Re: RFR JDK-8223347 Integration of Vector API (Incubator): Java API, implementation, and tests

2020-05-06 Thread John Rose
Thanks, Paul!  Talking with you about it helped me formulate my thoughts better.

> On May 6, 2020, at 9:02 AM, Paul Sandoz  wrote:
> 
> Hi John,
> 
> Thanks.  For the benefit of others, John and I had a long chat about this and 
> Joe’s CSR comments.
> 
> I have a better appreciation of your approach to the design and some of the 
> more subjective aspects to guide developers to API points, and to make code 
> more readable (that’s creative API design :-) ).
> 
> I agree with your assessment on size, lane count, and 
> Mask/Shuffle.vectorSpecies.
> 
> Re: VectorSpecies.fromByteArray, I now see the method Vector.reinterpretShape 
> appeals to VectorSpecies.fromByteArray for its specification.  Removal would 
> result in a less elegant specification of the behavior (making harder to 
> understand).  In that sense I think it’s worth its weight.  However, I would 
> suggest keeping in sync with a proposed change (on panama-dev) to the related 
> load/store byte[]/ByteBuffer methods, requiring they all accept a ByteOrder.
> I think this does bring up the wider point you raised about where factory 
> methods reside, and I agree about waiting for specialized generics, as that 
> might allow us to make better moves.
> 
> Paul.




Re: RFR JDK-8223347 Integration of Vector API (Incubator): Java API, implementation, and tests

2020-05-05 Thread John Rose
On May 1, 2020, at 1:36 PM, Paul Sandoz  wrote:
> 
> Hi Remi,
> 
> Thanks for the feedback. I am gonna need to go over this with John. Some 
> comments inline.
> 
> 
>> On Apr 20, 2020, at 6:31 AM, Remi Forax  wrote:
>> 
>> Hi Paul,
>> about the API part (not the implementation),
>> there are location where the same concept is described with a different 
>> names which doesn't help to understand how thing work
>> - vector length <=> vector lane count
>> - vector shape <=> vector bits size
>> - element size <=> lane size
>> 
>> "size" should never be used (VectorSpecies.elementSize() by example ) 
>> because it's never clear if it's the byte size or the bits size, using 
>> byteSize and bitsSize as used several times in the API should be used 
>> everywhere.
> 
> I agree with being explicit here e.g. 
> s/Vector.elementSize/Vector.elementBitSize and other usages.

The name “size” follows the precedent of the API point
Integer.SIZE, which is the size of an `int` in bits.  Although
bit-size and byte-size is an important thing to keep track of,
in the case of the Vector type, it lives primarily in registers,
like `int`, and only secondarily in memory (as `int[]` lives
primarily in memory).  When you need the size of a datum
in a register, you appeal to its bit count, not its byte count.

When you store in memory, then you might want its byte
size but for vectors that is an edge case.

So I don’t think we need bitSize, as a more explicit term for
size, for vectors and their lanes.  It is enough to say “size”.

> 
> I would prefer to consistently use length in methods to refer to the lane 
> count, it matches more closely to an array and for thinking about bounds 
> checking.

Yes, in this way (contrary to what I just said), a Vector is like
an array, and so it just has a length.  It’s a small array, and it
lives in a register, and it doesn’t support good random access,
but it has a length like an array.

> 
> How about on Vector:
> 
> /**
> * Returns number of vector lanes ({@code VLENGTH}), commonly referred to as 
> the lane count.
> *
> * @return the number of vector lanes
> */
> public abstract int length();

Yes.  Although laneCount would be a runner-up for naming this API point.
But a Vector is enough like an array or String that giving it a length
is the least confusing choice.  With documentation that says somewhere
that a vector looks like a little array formed from its lanes, so its length
as a vector is the count of its lanes.

Also, the semi-formal symbol {@code VLENGTH} is employed throughout
the javadoc, and would have to change to something even uglier like
{@code VLANECOUNT} if we got rid of Vector::length.

> And then change VectorShape.laneCount to VectorShape.laneLength.
> 
> ?

That sounds wrong, as if it were reporting the length of a lane somewhere.

Lane-count is correct.  As the javadoc explains clearly, a vector
has a certain number of lanes.

> 
> 
>> 
>> Initially, having VectorSpecies and VectorShape confuse me, it's not the 
>> shape of the vector but only it's bits size,
>> renaming VectorShape to VectorBitsSize will help, i think. 
>> 
> 
> Shape is quite a fundamental concept specified and used in the API, 
> representing the information capacity of a vector.
> 
> I think it would be misleading to refer to it only as VectorBitSize and 
> unwieldy to change the specification accordingly. In this sense it serves as 
> a useful abstraction to talk about vectors of the same bit size, and 
> operations that change that size or not (such as shape-changing or 
> shape-invariant).

More comments:

Shape also plays a critical role in the miscibility of vectors.
Two vectors can line up to perform a lane-wise operation if
and only if they have the same shape.  This is fundamental.
If we were to make these rules more “clever”, allowing
cross-shape lanewise operations, we’d give up on our current
performance levels, because the necessary dynamic checks
would put sand in the gears of the optimizer.

There are present and future hardware platforms, and software
abstractions, for which shape is more than just “the number of
my bits”.  The max-bits shape is a dynamically determined
shape which is not the same as any statically determined shape.
In the future, shapes may be distinguished according to whether
the involved vectors have intrinsic masks or not.  In the near
future, we might like to take the option of making synthetic
shapes, such as “a pair of 2-lane doubles of VLENGTH=4”,
which are often seen in other vector developer toolkits.

> 
>> You have the same values defined at several places in the API (sometimes 
>> with different names), usually it's aliases but it makes the API bigger that 
>> it should.
>> I think all the reference to the element type, vector byte size, vector bit 
>> size, element byte size, element byte size, vector lane count should not 
>> appear on Vector because they are accessible from Species,
>> and to go from a vector to the corresponding species, species() 

Re: Sponsor Request: 8241100: Make Boolean, Character, Byte, and Short implement Constable

2020-03-20 Thread John Rose
On Mar 20, 2020, at 10:49 AM, Jorn Vernee  wrote:
> 
> W.r.t. the source type being needed, I see the following 4 major cases:
> 
> 1. src=Prim & dst=Prim -> cast convert. For boolean the least-significant-bit 
> is used to convert it to/from an integer type.
> 2. src=Prim & dst=Ref   -> box the source value and potentially cast
> 3. src=Ref   & dst=Prim -> apply an unboxing conversion if possible and then 
> a casting conversion (with same trick for boolean as (1))
> 4. src=Ref   & dst=Ref   -> reference cast
> 
> Without the source type we can't disambiguate between cases (2) and (4), or 
> (1) and (3) because the bootstrap takes Object as an input.

I think that if you box the src=Prim from 1 or 2 and then
hand the resulting src=Ref to 3 or 4 you will get the same
thing.  The basic design here is that dst=Prim is the key
signal that unlocks the primitive conversions.  The type
of the src (Ref or Prim) doesn’t change the access to the
primitive conversions.

> For (2) and (4) the bootstrap invocation mechanism takes care of the boxing 
> for us, and the cast is performed by (4). For (1) and (3) the conversion code 
> for the latter handles conversion of wrapper types to Number and then to the 
> target primitive per (1). In the end things seems to work out to the same 
> result (though maybe I'm missing some subtle difference in a failure case).

Indeed they do.  That’s how I designed it.  The upshot is that
Object is a reliable carrier even of primitive values, for these
APIs.

> What I'm mostly worried about is that the source type already affects _how_ 
> the conversion is done, and the fact that this difference is not observable 
> seems somewhat incidental Coupled with the fact that asType and 
> explicitCastArguments also have access to both the source and destination 
> type, I think maybe the new bootstrap method should as well. After all, what 
> if the set of cases is extended (valhalla?) and/or not being able to 
> disambiguate starts to matter?

You shouldn’t be appealing to an internal function here for
designing the semantics.  The dual arguments to the internal
function are excess information; the shape of that function
predates the design decision described above.

Instead, appeal to some pseudocode, such as:

var id = identity(Object.class);
var mt = methodType(dst, Object.class);
var conv = explicitCastArguments(id, mt);
return conv.invoke(x);

I think that will give us all conversions we will need.

> 
> ---
> 
> I've written a more in-depth specification for the bootstrap, and 
> re-implemented it using explicitCastArguments, since that helps to catch 
> discrepancies between the input value and the source type. Here is the next 
> iteration: http://cr.openjdk.java.net/~jvernee/8241100/webrev.01
> 
> I've kept the source type for now, if it should be removed the specification 
> can be trimmed down (since there would be less cases to specify).
> 
> As for the name; I think "asType" might be confusing since the applied 
> conversion is not quite the same as MethodHandle::asType. Since the bootstrap 
> is implemented in terms of explicitCastArguments I went with "explicitCast", 
> how is that?

Yes, I like that name.

— John

Re: Sponsor Request: 8241100: Make Boolean, Character, Byte, and Short implement Constable

2020-03-18 Thread John Rose
On Mar 18, 2020, at 3:01 PM, John Rose  wrote:
> 
> explicitCastArguments

P.S.  That method has an intentionally scary name,
but behind the name, it’s just “asType, the JVM edition,
plus narrowing primitive conversions”.  The design
center for asType is safe and sane conversions as
allowed by variable assignment in Java, while the
other guy adds all the other conversions that might
be useful.  If you know statically what is the input
value type, as with BSMs working on CP data, then
“explicitCastArguments” is just as safe as “asType”
but it does more tricks for you.

Re: Sponsor Request: 8241100: Make Boolean, Character, Byte, and Short implement Constable

2020-03-18 Thread John Rose
On Mar 18, 2020, at 8:54 AM, Brian Goetz  wrote:
> 
> Overall, the approach seems sound, and I like that you are introducing only 
> one new bootstrap that is usable for a range of things.  A few comments and 
> questions.  
> 
> 1.  Not sure the explicit source type carries its weight, but whether it does 
> or not is probably informed by whatever we do for (3) below.  

It seems true to me that Object is an acceptable universal source
type.  You can convert anything in the CP to Object without loss
of information.  IIRC this is true even for the asType rules; they
are designed to take most of their information (when deciding
which conversions to apply) from the target type, not the source
type.

> 2.  Naming: “convert” seems kind of general; the name should suggest some 
> sort of asType conversion.  asType is pretty rich, and will get richer in the 
> future, so aligning with that is probably a good move.

+1  Suggest “convertAsType”.

> 3.  The spec on convert() needs to be fleshed out, as to exactly which set of 
> conversions are supported.  “Exactly the same set as asType” is OK; “similar 
> to asType” is kind of vague.  Similarly, the spec on parameters and 
> exceptions could be fleshed out a bit too.  

+1 I suggest a formal semantics along these lines:

var id = identity(sourceType);  // could be destType also
var mt = methodType(destType, sourceType);
var conv = explicitCastArguments(id, mt);
return conv.invoke(x);

The explicitCastArguments is linked to asType but adjusts
the handling of interfaces and of subword types (char
and boolean, byte and short) to more closely match the
conventions used by JVM bytecodes, when that differs
from the conventions used in the Java language.
I think it covers the conversions of JVM-level values
you are likely to encounter in the constant pool.

— John

Re: RFR: 8236641: Improve Set.of(...).iterator() warmup characteristics

2020-01-15 Thread John Rose



> On Jan 15, 2020, at 5:46 PM, Stuart Marks  wrote:
> 
> 
> 
>> On 1/13/20 8:03 AM, Claes Redestad wrote:
>> we're doing plenty of iterations over Set.of() instances during
>> bootstrap, which makes these operations show up prominently in
>> startup profiles. This patch refactors a few hot methods to get a
>> measureable startup improvement without regressing on targeted
>> microbenchmarks.
>> Bug:https://bugs.openjdk.java.net/browse/JDK-8236641
>> Webrev: http://cr.openjdk.java.net/~redestad/8236641/open.00/
> 
> OK, interesting and promising. A potential pitfall and a nitpick, though 
> perhaps an important one.
> 
>  65 static {
>  66 long color = 0x243F_6A88_85A3_08D3L; // pi slice
>  67 long seed = System.nanoTime();
>  68 int salt = (int)((color * seed) >> 16);  // avoid LSB and MSB
>  69 REVERSE = salt >= 0;
>  70 // Force SALT to be positive so we can use modulo rather
>  71 // than Math.floorMod
>  72 SALT = Math.abs(salt);
>  73 }
> 
> After this code runs I'm not convinced that SALT will be positive... consider 
> Integer.MIN_VALUE! Maybe just mask off the top bit instead? Thirty-one bits 
> of salt should be sufficient for our purposes here.

Good catch.  Should be:

SALT = salt >>> 1;

Pleasingly it uses exactly the 32 rando-bits. 

> 
> In SetNIterator::next there is this code:
> 
> 
> 789 do {
> 790 if (REVERSE) {
> 791 if (++idx >= len) {
> 792 idx = 0;
> 793 }
> 794 } else {
> 795 if (--idx < 0) {
> 796 idx = len - 1;
> 797 }
> 798 }
> 799 }
> 800 while ((element = elements[idx]) == null);
> 
> Ah, the rare do-while loop! I think this makes sense, but it reads oddly, 
> because it looks like the trailing "while" starts a new statement when in 
> fact it's part of the do-while statement. The trailing semicolon then makes 
> this look like a mistake. I'd suggest merging 799 and 800 so the last line of 
> the loop looks like this:
> 
> 799 } while ((element = elements[idx]) == null);
> 
> It's a small thing, but the "} while" on a line should be a tip-off that this 
> is the rare do-while loop and not something else.

+1

> 
> Thanks,
> 
> s'marks



Re: RFR: 8236850: Operations on constant List/Set.of(element) instances does not consistently constant fold

2020-01-13 Thread John Rose
On Jan 13, 2020, at 1:23 PM, Claes Redestad  wrote:
> 
> On 2020-01-13 22:18, Paul Sandoz wrote:
>> That’s fine, it was really for informational purposes in case it comes in 
>> handy later on.
> 
> Got it.
> 
>> Speculating: if we had such a public method (with minimal dependencies) I 
>> suspect we would reach for it in this case even if overkill.  It reduces the 
>> cognitive load, but as a downside it would not afford us the opportunity of 
>> a slice Pi:-)
> 
> Not sure it'd pull its own weight as a public API, but refactoring to a
> shared internal utility might be a good start?

Yes.  I want it in the JVM so I can use it there also, to fuzz things like
stack positions and TLB chunk sizes and monitor wait times and
central hash table placements.

As long as the core libs only need it for an occasional static constant,
the JNI call overhead will not be a problem.

— John

Re: RFR: 8236850: Operations on constant List/Set.of(element) instances does not consistently constant fold

2020-01-13 Thread John Rose
On Jan 13, 2020, at 11:49 AM, Paul Sandoz  wrote:
> 
> For information purposes: an alternative salt might be to use mix32 from 
> Splittable random:
> 
> /**
> * Returns the 32 high bits of Stafford variant 4 mix64 function as int.
> */
> private static int mix32(long z) {
>z = (z ^ (z >>> 33)) * 0x62a9d9ed799705f5L;
>return (int)(((z ^ (z >>> 28)) * 0xcb24d0a5c88c35b3L) >>> 32);
> }
> 
> as in:
> 
> SALT = mix32(System.nanoTime());

Thank you for mentioning that!  I get this google hit on it:

https://blogs.oracle.com/dave/a-simple-prng-construction-idiom

There are also similar functions in xxHash.

https://github.com/easyaspi314/xxhash-clean/blob/master/xxhash64-ref.c#L104

Here’s a scary list:

https://en.wikipedia.org/wiki/List_of_hash_functions

I think “doing it right” is overkill for this one place, but I hope we can put
something like that into the JVM or some other low-level place for use
whenever we need a cheap pseudo-random nonce of some sort.

Once there’s an API for it I think we will reach for it a little more often.
Having a common engine means we can afford to tune its predictability
qualities.  In both directions:  resistance to attack, and reproducibility
for debugging.  In the case of Set.of, the former concern doesn’t apply
much, but the latter might.

— John

Re: RFR: 8236850: Operations on constant List/Set.of(element) instances does not consistently constant fold

2020-01-12 Thread John Rose
On Jan 11, 2020, at 3:59 AM, fo...@univ-mlv.fr wrote:
> In my opinion, it's better to introduce an annotation @TrueScottmanFinal 
> instead of using @Stable in a way it was not designed to be used.

And even better than that is fixing “final” so it’s really final.

But @TrueScottmanFinal is surprisingly harder than @Stable.
The problem is that the JIT needs to distinguish two states
that look very similar:  1. The final variable is freshly created
and hasn’t been initialized yet, and 2. The final variable has
been initialized (redundantly) to its default value.

The design of @Stable neatly steps around this problem by
defining the contract to be the same in both cases.  A real
fix to “final”, or an intermediate @TrueScottmanFinal, seems
to require new mechanisms in the JVM for tracking initialization
(at least, initialization to default).  Either that or else we need
to convince ourselves that it’s OK for the JIT to capture the
default value from an uninitialized final and hold onto it
forever, which I don’t think works.  Or a hybrid scheme
where the JIT captures a default and polls for updates,
somehow, at some rate.  The problem with these latter
solutions is that (a) they probably will cause bugs, and (b)
they are probably (but not certainly) illegal in the JMM,
since the JMM does not clearly model the effects of threads
re-using JIT-ted code.

So, coming up from that deep dive, the result is that we
are stuck with @Stable for the present, not because it is
the perfect design for the job, but it is the best we can do
without adding complicated tracking to JIT constants.
It’s served us well, and will continue to do so until we do
that extra work.

— John

Re: RFR: 8236850: Operations on constant List/Set.of(element) instances does not consistently constant fold

2020-01-10 Thread John Rose
On Jan 10, 2020, at 6:48 PM, Claes Redestad  wrote:
> 
> Yes. The XOR seems pointless with this approach, but some shifting
> might be good.

A single multiply mixes the bits of its inputs.  If one input is
“random” or “white” (50% 1’s irregularly spaced) then the output
will probably look similar.  The output bits are not equally mixed,
though:  The LSB depends only on two bits of the input, and the
MSB (barring rare carry-outs) depends mostly on two other bits
of the input.  But the middle bits depend on most of the input bits.

So, given the goal of making a very simple, good-enough mixing
expression to get a 32-bit salt, this one is relatively good:

   long COLOR = 0x243F_6A88_85A3_08D3L;  // pi slice
   long SEED = System.nanoTime();  // or command line value
   int SALT = (int)( (COLOR * SEED) >> 16 );  // avoid LSB and MSB

In the longer term, I think the JVM should provide salt values
both for itself and (a few) core libs., and should allow those values
to be (partially) configured from the command line.  (Crash dumps
should report the SEED values used for reproducing the error.)
Given such a facility, it would be reasonable to upgrade it to use
better quality entropy sources and hashing (maybe optionally
crypto-quality, though I have reservations about that).  That
would make the variable behaviors unpredictable in practice.
Except when they are intentionally configured to be predictable.
An algorithm like xxHash would be a good starting point; it’s
cheap and salty.

— John

Re: RFR: 8236850: Operations on constant List/Set.of(element) instances does not consistently constant fold

2020-01-10 Thread John Rose
On Jan 10, 2020, at 10:47 AM, Remi Forax  wrote:

> Seem to be a JIT bug to me,
> the fields of Set12 are declared final not @Stable (even if the field storing 
> the instanceof of Set&2 itself is marked @Stable) so there is no reason not 
> not let the constant folding happen.

It’s part of the contract of @Stable that default values (null) are not
inlined.  The reason for this is that some @Stable variables are lazy,
and the JIT needs to avoid assuming that default values stay that
way forever.

So, Claes is right that the empty slots need filling with a non-default
(non-null) value.

I think I’d prefer to see the following filler value rather `this`:

private static final EMPTY = new Object();

It might make escape analysis more robust if we don’t have objects
randomly storing pointer to themselves.

That value `SALT >= 0` inverts regularly every four seconds,
doesn’t it?  Not very salty, that.  I suggest adding a multiply to
make that sign bit more spicy:

 long nt = System.nanoTime();
 nt *=  0x243F_6A88_85A3_08D3L;  // a slice of pi
 SALT = (int)((nt >>> 32) ^ nt);
REVERSE = SALT >= 0;

The hex number is the fractional part of pi (in hex).
Any arbitrary odd number with a balance of 1’s and 0’s will mix
things up well enough to produce an irregularly varying sign bit.
Using a well-known number makes it obvious there’s nothing
up our sleeve.

— John

P.S.  Not for this change set, but such salty values should be
derived from an optionally specified seed value, so JVM executions
can be made reproducible.  Could be a diagnostic flag:

java -XX:+UnlockDiagnosticVMOptions -XX:VMSaltSeed=42

The salt seed would be initialized from the nanotime or some such.



Re: RFR JDK-8234049: Implementation of Memory Access API (Incubator)

2019-12-12 Thread John Rose
On Dec 9, 2019, at 10:16 AM, Peter Levart  wrote:
> What do you think? Might this be better performance wise?

Yes, a fast positive proof that the access is allowed can be
implemented along the lines you mention.

The owner token could be overloaded in additional ways, if and
when we do more patterns of sharing.

The case of a permanently shareable heap-allocated segment could
be handled by a similar fast check, if some constant sentinel value
ALL_THREADS is stored in the owner instead Thread.current().

The case of a temporarily shareable segment, held jointly by several
threads, could be handled by an owner token which was suitably tied
to those threads.

This suggests that “owner == currentThread()” could be at some point
generalized to “owner.accepts(currentThread())” assuming that the
fast paths of OwnerSet::accepts could be optimized well.

— John

Re: RFR: JEP 359: Records (Preview) (full code)

2019-12-02 Thread John Rose
On Dec 2, 2019, at 4:36 PM, David Holmes  wrote:
> 
> You are using CHECK_0 in RecordComponent::create but that method returns oop. 
> That seems wrong to me, but I see many other oop returning methods also use 
> CHECK_0 so I'll take that up separately. (It's only a cosmetic issue.)

CHECK_NULL is clearly preferable.  At some point IIRC it was more
than just cosmetics, because we were getting warning noise about
`return 0` converting to NULL, from some C++ compiler or other.
AFAIK such warnings can still happen.

— John

Re: New candidate JEP: 370: Foreign-Memory Access API

2019-12-02 Thread John Rose
On Dec 1, 2019, at 8:07 AM, David Lloyd  wrote:
> 
> Okay, anyway I'm glad that I'm not the first person to have thought of
> these use cases.  The API looks good!

Thanks for reviewing and commenting!  We need to “shake out” our
designs exactly like that.

— John

Re: New candidate JEP: 370: Foreign-Memory Access API

2019-11-29 Thread John Rose
On Nov 29, 2019, at 4:24 AM, David Lloyd  wrote:
> 
> Makes sense, it's still early.  But now the seed is planted. :)

(It’s better documented.  Most of the points made here were not new.)



Re: [14] RFR (S): 8234923: Missed call_site_target nmethod dependency for non-fully initialized ConstantCallSite instance

2019-11-29 Thread John Rose
Reviewed.

> On Nov 29, 2019, at 7:55 AM, Vladimir Ivanov  
> wrote:
> 
> http://cr.openjdk.java.net/~vlivanov/8234923/webrev.00/
> https://bugs.openjdk.java.net/browse/JDK-8234923



Re: RFR: 8234335: Remove line break in class declaration in java.base

2019-11-19 Thread John Rose
On Nov 19, 2019, at 4:00 AM, Lance Andersen  wrote:
> 
> Seems to be a “your milage varies”.  I am fine with whatever the final 
> decision is.  However, I do believe the comment above reads better and aligns 
> the methods better.

FWIW, and as author of many of the lines being changed, I prefer that comment 
on a separate from the actual modifiers.  I think the basic modifier patterns 
are easier to recognize quickly when comments and annotations are separate.

— John

P.S. And for the record, my intention of writing non-public is to ensure make 
it clear to maintainers that the absence of any modifier is both intentional 
and important (to security, in most cases).  The language doesn’t give any more 
explicit way to document that default access is required.



Re: RFR 8233920: MethodHandles::tryFinally generates illegal bytecode for long/double return types

2019-11-13 Thread John Rose
On Nov 13, 2019, at 12:39 PM, Claes Redestad  wrote:
> Similar to other promising candidates like constant folding I think it
> opens up some more general questions about observability. For example:
> do we retain the original bytecode and mappings everything back to it
> so that a debugger would be none the wiser about any inlining that may
> have happened? Do we "simply" deoptimize and redefine everything back to
> the stashed original when someone hooks in a debugger..? (Do we go so
> far as to allow opting out of debuggers entirely?)

Yep.  These questions are why link-time optimization is not a slam dunk.
If it were, I suppose we’d have it already.

Debugging is especially hard.  We don’t want to give it up, but any
program transformation is going to make debugging harder to implement.
You should see what the JITs do to support deopt; like exact GC, it’s
something that makes them different animals from the usual llvm/gcc
compilers.

That said, a bytecode-to-bytecode “deoptimization” is easier to specify
and think about than a machine-code-to-bytecode deopt, which is what
the JITs support.  Juicy problems everywhere!

Also I wonder if stashing classifies for debugging only will motivate a partial
resurrection of pack200, for storing said things “cold” on disk.



Re: RFR 8233920: MethodHandles::tryFinally generates illegal bytecode for long/double return types

2019-11-13 Thread John Rose
On Nov 13, 2019, at 11:24 AM, Claes Redestad  wrote:
> 
>> Thanks for suggesting that Vladimir.  That indeed is how the existing code 
>> is organized.
> 
> Yes, not very startup friendly... ;-)

No, but it’s maintainer friendly.  We can walk and chew gum at the same time.

Is it time to think about a jlink plugin which inlines simple method calls?
Perhaps we need to wait until we understand (semi-)closed-world packaging.

Long, long ago, a very early version of javac did inlining of simple method 
calls.
That functionality was removed when we realized that this was placing “getfield”
instructions in classes which didn’t have access to the referenced private 
fields,
causing code to break.  It also messed up the binary compatibility story 
required
to make sense of independently recompiled class files.  How innocent we were
back then.  We won’t make those same mistakes again, but we could try the
same trick again, more carefully, in jlink or some other static packaging 
workflow.

— John

Re: RFR 8233920: MethodHandles::tryFinally generates illegal bytecode for long/double return types

2019-11-13 Thread John Rose
On Nov 13, 2019, at 2:28 AM, Vladimir Ivanov mailto:vladimir.x.iva...@oracle.com>> wrote:
> 
>private void emitPopInsn(BasicType type) {
>  mv.visitVarInsn(popInsnOpcode(type));
>}
> 

Thanks for suggesting that Vladimir.  That indeed is how the existing code is 
organized.

Re: RFR: CSR JDK-8233117 Escape Sequences For Line Continuation and White Space (Preview)

2019-11-05 Thread John Rose
On Nov 5, 2019, at 10:17 AM, Remi Forax  wrote:
> 
> the rationale to add \ is well explain but why do you want to 
> introduce \s given we already have \u0020 ?

Remi, you are an expert in the field and you got fooled!  This is *exactly* why 
we need \s.

The \u syntax is expanded *before* token scanning, which means that the 
poor programmer
who tries \u0020 will see it stripped by the whitespace stripper.  Using \u 
where  is
an ASCII code point is an anti-pattern!  (I think we would help our users if we 
deprecated it.)

You might have said \040, but that too is hard to remember and to read and to 
use correctly.
\s is the answer.

— John

Re: RFR: 8231355: Remove unused utility methods in libjava

2019-10-07 Thread John Rose
On Oct 7, 2019, at 11:38 AM, Claes Redestad  wrote:
> 
> Bug:https://bugs.openjdk.java.net/browse/JDK-8231355 
> 
> Webrev: http://cr.openjdk.java.net/~redestad/8231355/open.00/ 
> 
> 
> Testing: tier1+2

"436 lines changed: 2 ins; 433 del; 1 mod”
Nice!



Re: RFR: JDK-8222373 Improve CDS performance for custom class loaders

2019-06-24 Thread John Rose
On Jun 20, 2019, at 12:12 AM, Ioi Lam  wrote:
> ...
> I have a rough idea -- let's have a higher-level representation of the 
> bytecode stream than byte[] or ByteBuffer, to make optimization possible.

Template classes will need something similar, so maybe there is a common design 
point in here.

The key feature of template classes is that their loading sequence is split 
into parts:

1. Load the template, defining an abstract API and preprocessing incomplete 
parts.
2. Specialize the loaded template, completing all parts, loading a specialized 
class ("species").

Step 1 happens once per template.  Step 2 can happen any number of types, with 
varying template parameters.

I think that, at least in the internals, there will be a BSM-like (now, that's 
a surprise) template specialization function (TSF) declared in step 1 and 
executed in step 2.

The TSF will need to consult the results of step one and request the JVM to 
combine them into the desired species.  Ad hoc logic may be involved, such as 
detecting if T in List is Comparable and mixing in Comparable into the 
resulting List, with a standard comparison method (e.g. lexicographic compare 
lifted from the elemental compares).

Key questions:

A. What API reflects the template chunks loaded in step 1?  Probably not just 
byte streams, more like what you are reaching for here, Ioi.

B. What API loads the customized chunks?  Probably (again) not today's 
ClassLoader which requires byte streams.

I'm slowly working on an answer to A, in the form of a "class excavator" which 
pulls out structural information from live JVM metadata, without "deadening" it 
into a byte stream.  Not ready for prime time, but the basic idea is to present 
the logical schema of the loaded class file (in terms of the original classfile 
structure) in a style similar to the existing constant pool reflection API 
(which is JDK internal).

Once we get a fuller answer for A we can try to invert it to get an answer for 
B.  That is, if we know what we'd like to see in the JVM (via the excavator) we 
want to tell the JVM to establish some new species definition, related to the 
excavated data.

For most JDK work we can wrap low-level excavator/establisher mechanisms (which 
don't need to be very O-O) in a thin layer of value types and interfaces.

I'm using new terms to help me think of this as a new kind of API, not just a 
small variation on class loading or reflection.  The new feature is that the 
reflective part comes first, and is followed (as a sort of reversed operation) 
by the loading part.  Also new is that we don't want to abstract everything 
through serialized byte streams, since they make it very hard to share 
features—but (I think) species need to share template metadata, not just make 
copies of it.

— John

> So we could have a new API in ClassLoader
> 
> protected final Class defineClass(String name, String location,
> ProtectionDomain protectionDomain)
> 
> and its behavior is equivalent to the following
> 
>{
> byte[] b = read_buffer_from(location);
> return defineClass(name, b, 0, b.len, protectionDomain);
>}
> 
> 
> examples would be:
> 
>  defineClass("java/lang/Object", "jrt:/java.base/java/lang/Object.class", 
> NULL);
>  defineClass("com/foo/Bar", "file://path/com.foo.jar!com/foo/Bar.class", 
> myProtectionDomain);
> 
> Note that the type and value of  is just for illustrative purposes. 
> We might use a different type (URI??). It's just a way to name a location 
> that you can read a byte buffer from.
> 
> The protectionDomain will need to be created by the caller. The use of the 
> protectionDomain will be no different than the existing defineClass() APIs. 
> Specifically, it will not be used in any way to fetch the buffer.
> 
> When CDS is enabled, the VM can check if the name+location matches a class in 
> the CDS archive. If so, the class is loaded without fetching the buffer.
> 
> The caller doesn't need to know if CDS is enabled or not.
> 
> 
> (We probably don't want a String type but a more abstract type. That way we 
> can evolve it to allow more complex representations, such as "read the 
> bytecode stream from here, but replace the method name "Foo" to "Bar", and 
> add a new integer field "X" 
> 
> If John Rose was to design this, I am sure he will call it something like 
> BytecodeStreamDynamic :-)
> 
> This may actually reduce the use of ASM. E.g., today people are forced to 
> write their own bytecodes, even if they just want some simple transformation 
> on template classes).
> 
> 
> Thanks
> - Ioi



Re: RFR: 8215017: Improve String::equals warmup characteristics

2019-04-11 Thread John Rose
On Dec 7, 2018, at 4:11 PM, Claes Redestad  wrote:
> 
> - Jim's proposal to use Arrays.equals is _interesting_: it improves
> peak performance on some inputs but regress it on others. I'll defer
> that to a future RFE as it needs a more thorough examination.
> 
> - what we can do is simplify to only use StringLatin1.equals. If I'm not
> completely mistaken these are all semantically neutral (and
> StringUTF16.equals effectively redundant). If I'm completely wrong here
> I'll welcome the learning opportunity. :-)

If they are semantically neutral then they belong in
the same place as the neutral intrinsics, which is
jdk.internal.util.ArraysSupport.  This in turn calls
the question of how the operation differs from
the existing mismatch intrinsic.

I'll go out on a limb here and say that the standard
ArraysSupport.mismatch intrinsic can probably be
tuned to cover string compares without hurting its
other customers.  Often such intrinsics are only
engineered for throughput on long inputs.
Later on we may find they can be additionally
tuned for other workloads, by adjusting the setup
logic, or (alternatively) they can be factored into
multiple entry points for specific use cases, with
complete sharing of assembly-level code except
for differing setup logic for the different use cases.
This sort of thing is the case with our arraycopy
intrinsics (used by the JIT) and may well also be
the case for array matching intrinsics.

I agree the above questions can be answered in
a separate investigation.

— John

Re: RFR: 8221836: Avoid recalculating String.hash when zero

2019-04-08 Thread John Rose
I agree that this is a good change, and you can use me as a reviewer.

I disagree with Aleksey; it's a new technique but not complex
to document or understand.  The two state components are
independent in their action; there is no race between their
state changes.

Meanwhile, there are two reasons I want the change:

1. Less risk of spurious updates to COW memory segments in
shared archives.

2. No risk of hashcode recomputation for the 2^-32 case.
This might seem laughable, until you remember that it's exactly
those cases that DOS attackers like to create.

Both are defense in depth, against performance potholes and
intentional attacks.

If we spent as much time documenting this change as we
spent complaining about its supposed uselessness, we'd
be done.

— John

On Apr 8, 2019, at 8:24 AM, Claes Redestad  wrote:
> 
> Right, this and possibly reducing latency when running with String
> deduplication enabled might be the more tangible benefits. Removing
> a cause for spurious performance degradations is nice, but mainly
> theoretical. There's likely a pre-existing negative interaction
> between string dedup and String archiving that would need to be
> resolved either way.
> 
> I've simplified the patch somewhat and folded set_hash/hash into
> hash_code (since direct manipulation of the hash field should be
> avoided), along with a comment to try and explain and caution about the
> data race:
> 
> http://cr.openjdk.java.net/~redestad/8221836/open.02/
> 
> Thanks!
> 
> /Claes
> 
> On 2019-04-08 12:24, Peter Levart wrote:
>> I think the most benefit in this patch is the emptyString.hashCode() 
>> speedup. By holding a boolean flag in the String object itself, there is one 
>> less de-reference to be made on fast-path in case of empty string. Which 
>> shows in microbenchmark and would show even more if code iterated many 
>> different instances of empty strings that don't share the underlying array 
>> invoking .hashCode() on them. Which, I admit, is not a frequent case in 
>> practice, but hey, it is a speedup after all.
>> Regards, Peter



Re: RFR: 8221723: Avoid storing zero to String.hash

2019-04-01 Thread John Rose
On Apr 1, 2019, at 12:50 PM, dean.l...@oracle.com wrote:
> 
> Wouldn't it be better to write a non-0 value when the computed hash code is 
> 0, so we don't have to recompute it?  Is there some advantage to writing 0 
> instead of any other value, such as 1?

Zero is the easiest sentinel value because it's the default
value for int.

*Any* 32-bit sentinel you choose will be the legitimate hashCode
of *some* string.  That is to say, the range of String.hashCode
is all 2^32 points of the int domain.  That's why we need an extra
bit somewhere if we want to distinguish a sentinel (like zero) from
an identical legitimate hash code that has been cached.

Luckily, there is plenty of "slack" in the memory layout of String.
We can find a bit, if we want to go there.

— John

Re: Proposal: JDK-8148917 Enhanced-For Statement Should Allow Streams

2019-03-11 Thread John Rose
On Mar 7, 2019, at 4:33 AM, Tagir Valeev  wrote:
> 
> Hello, Remi!
> 
> It actually works thanks to auto-boxing/unboxing. E.g. this
> implementation works:
> 
> static Iterable range(int from, int to) {
>  return () -> new PrimitiveIterator.OfInt() {
>int cur = from;
> 
>@Override
>public int nextInt() {
>  if (cur >= to) {
>throw new NoSuchElementException();
>  }
>  return cur++;
>}
> 
>@Override
>public boolean hasNext() {
>  return cur < to;
>}
>  };
> }
> 
> public static void main(String[] args) {
>  for(int i : range(0, 100)) {
>System.out.println(i);
>  }
> }
> 
> It correctly compiles and prints numbers from 0 to 99. As IntStream
> extends BaseStream and BaseStream BaseStream> defines Iterator iterator(), it would be no
> problem with using IntStream.range in such code pattern were
> BaseStream extends IterableOnce.
> 
> Of course this produces unnecessary garbage, as I said.

This is a relatively simple kind of garbage to remove, because
it is made (by calls to Integer.valueOf) at the adapted boundaries
of the iterator, which are readily inlined into the loop.  The deeper
internal logic of the range function is box-free, as is the loop itself,
so the garbage is relatively easy to remove.

That said, "out of the box" there is lots of garbage created unless
-XX:+AggressiveUnboxing is turned on.  I assume Graal does a good
job on it, even without this switch.

If we ever succeed in suppressing the identity of java.lang.Integer,
and/or after making functions like range into reified generics, the
boxing will go away even more directly and simply.

So, from the JIT engineering point of view, I would classify this
particular boxing problem as relatively short-lived.

— John

P.S. I also saw the fiery objections to the range() idiom, and I have
to disagree with those disagreers!  I prefer the range idiom to the
multi-part for-init-test-step idiom, because counters are semantically
*more complex* than ranges, from the viewpoints I typically use to
reason about programs.  (There are times when a separate counter
state *is* preferable to me, but usually when there is some sort of
extra "i += skip" side effect in the loop.)  That's got to be a matter of
taste.  I suppose some people habitually reason about programs
in lower-level terms, like x86 instructions, and there a counter is no
more complex than a range; for those people there's no reason to
learn a different concept than a for-loop.  And surely there are other
points of view that favor the good old for-loop-with-int-counter.

I'm more tolerant of new-fangled range-based notations because they
let me avoid having to reason about side-effect-laden entities
like counters, and that feels like a good thing to me.  Tastes vary.

> 
> With best regards,
> Tagir Valeev.
> 
> On Wed, Mar 6, 2019 at 7:37 PM Remi Forax  wrote:
>> 
>> Hi Tagir,
>> try to do it now and you will see that you can't, because you can not write 
>> Iterable yet.
>> Once we will get generics over value types, it will be a no-brainer.
>> 
>> Rémi
>> 
>> - Mail original -
>>> De: "Tagir Valeev" 
>>> À: "Stuart Marks" 
>>> Cc: "core-libs-dev" 
>>> Envoyé: Mercredi 6 Mars 2019 11:10:41
>>> Objet: Re: Proposal: JDK-8148917 Enhanced-For Statement Should Allow Streams
>> 
>>> Hello!
>>> 
>>> By the way one of the painful code patterns in modern Java is `for(int
>>> i = 0; i>> newbies and prone to errors as the variable need to be repeated three
>>> times. Also the variable is not effectively final, despite it never
>>> changes within the loop body, so could have been considered as
>>> redeclared on every loop iteration (like in for-each). With the new
>>> proposal it's possible to write `for(int i : range(0, bound).boxed())`
>>> (assuming import static j.u.s.IntStream.range), which looks much
>>> better, though it has obvious performance drawback. Moving
>>> IterableOnce to BaseStream would enable to use `for(int i : range(0,
>>> bound))` which looks even better, though again we have plenty of
>>> garbage (but considerably less than in previous case!). I wonder
>>> whether Java could evolve to the point where such kind of code would
>>> be a recommended way to iterate over subsequent integer values without
>>> any performance handicap.
>>> 
>>> With best regards,
>>> Tagir Valeev.
>>> 
>>> On Fri, Mar 1, 2019 at 9:47 AM Stuart Marks  wrote:
 
 Hi all,
 
 Please review and comment on this proposal to allow Stream instances to be 
 used
 in enhanced-for ("for-each") loops.
 
 Abstract
 
 Occasionally it's useful to iterate a Stream using a conventional loop. 
 However,
 the Stream interface doesn't implement Iterable, and therefore streams 
 cannot be
 used with the enhanced-for statement. This is a proposal to remedy that
 situation by introducing a new interface IterableOnce that is a subtype of
 Iterable, and then retrofitting the Stream interface to implement it. 

Re: RFR: JDK-8216558: Lookup.unreflectSetter(Field) fails to throw IllegalAccessException for final fields

2019-01-14 Thread John Rose
On Jan 11, 2019, at 4:53 PM, David Holmes  wrote:
> 
> I think there is a problem knowing when "access check" means just access 
> check and when it means "access check plus the special hack for setting final 
> fields". I'm not reading any final field semantics into the existing text, 
> because bytecode does not know about setAccessible and final fields.

The "bytecode behavior" of an unreflectSetter MH is clearly documented
by appealing to a call to Field::set.  That is a pre-existing specification
that proves that the observed behavior for statics is a bug, and for
non-statics is a feature.  No CSR is needed, and, David, you are simply
looking in the wrong places.  I admit that the right place is tricky to find.

> I can see an argument to be made for the case where setAccessible(true) has 
> been called to disable language level access checks, and you want the MH to 
> follow the same access rules. But that does not to me imply that the ability 
> to write to a final field should be part of that. If that is intended then it 
> needs to be very, very clearly spelt out.

Yes, the documentation should be improved, so that email threads like
this won't be necessary to tease out the meaning of the spec.  I placed
a proposal for *clarifying* (not *changing*) it in the bug comments.

> And as per my other email there is already a problem with the bytecode 
> equivalence of MH semantics and final fields, because you can set final 
> fields in  or  (though only once). I don't think MH captures 
> that - nor can it.

That's true.  The most it can do is capture the effect, if any, of
setAcc(true) on a reflected Field.

> Which means we can't defer to bytecode semantics here but rather have to very 
> clearly define how MH interact with final fields.

And that's already done, except for the "clearly" part.

Thanks for raising this issue!

— John

diff --git a/src/java.base/share/classes/java/lang/invoke/MethodHandles.java 
b/src/java.base/share/classes/java/lang/invoke/MethodHandles.java
--- a/src/java.base/share/classes/java/lang/invoke/MethodHandles.java
+++ b/src/java.base/share/classes/java/lang/invoke/MethodHandles.java
@@ -421,6 +421,10 @@
  * because the desired class member is missing, or because the
  * desired class member is not accessible to the lookup class, or
  * because the lookup object is not trusted enough to access the member.
+ * In the case of a field setter function on a {@code final} field,
+ * finality enforcement is treated as a kind of access control,
+ * and the lookup will fail, except in special cases of
+ * {@link Lookup#unreflectSetter Lookup.unreflectSetter}.
  * In any of these cases, a {@code ReflectiveOperationException} will be
  * thrown from the attempted lookup.  The exact class will be one of
  * the following:
@@ -1436,6 +1440,7 @@
  * @return a method handle which can store values into the field
  * @throws NoSuchFieldException if the field does not exist
  * @throws IllegalAccessException if access checking fails, or if the 
field is {@code static}
+ *or {@code final}
  * @exception SecurityException if a security manager is present and it
  *  refuses access
  * @throws NullPointerException if any argument is null
@@ -1559,6 +1564,7 @@
  * @return a method handle which can store values into the field
  * @throws NoSuchFieldException if the field does not exist
  * @throws IllegalAccessException if access checking fails, or if the 
field is not {@code static}
+ *or is {@code final}
  * @exception SecurityException if a security manager is present and it
  *  refuses access
  * @throws NullPointerException if any argument is null
@@ -1868,19 +1874,29 @@
 /**
  * Produces a method handle giving write access to a reflected field.
  * The type of the method handle will have a void return type.
- * If the field is static, the method handle will take a single
+ * If the field is {@code static}, the method handle will take a single
  * argument, of the field's value type, the value to be stored.
  * Otherwise, the two arguments will be the instance containing
  * the field, and the value to be stored.
- * If the field's {@code accessible} flag is not set,
+ * If the {@code Field} object's {@code accessible} flag is not set,
  * access checking is performed immediately on behalf of the lookup 
class.
  * 
- * If the field is static, and
+ * If the field is {@code final}, write access will not be
+ * allowed and access checking will fail, except under certain
+ * narrow circumstances documented for {@link Field#set Field.set}.
+ * A method handle is returned only if a 

Re: RFR: JDK-8215510: j.l.c.ClassDesc is accepting descriptors not allowed by the spec

2019-01-09 Thread John Rose
On Jan 9, 2019, at 2:04 PM, John Rose  wrote:
> 
> you might consider using
> a little combinatorial code to generate bad and good class
> names

P.S. To motivate this suggestion a bit more:  I found no
problem with your manually-written test vectors of bad
and good names, but I also found it difficult to be confident
that they covered the ground adequately.  Combinatorial
logic for generating test vectors is often overkill, but it is
easier to be confident about.  That in turn makes us more
confident that the actual code itself is free of bugs.

Also, when I use such tactics in my own test development,
I often find more bugs in my code, than with my first cut
tests, which are small ad hoc test vectors.  I won't promise
the same for you, but it's a possibility.


Re: RFR: JDK-8215510: j.l.c.ClassDesc is accepting descriptors not allowed by the spec

2019-01-09 Thread John Rose
Nice work.  A couple of small points:

A qualified class name is one that has a package prefix.
So it is surprising that verifyUnqualifiedClassName allows
a package prefix.  Maybe it needs a different name.

The testing looks adequate, but you might consider using
a little combinatorial code to generate bad and good class
names.  Such code could be shared between the unit
tests that test x.y names and Lx/y; names.

Something like:

char goodsep = '.' or '/';
ArrayList badnames, goodnames; …
goodnames.addAll(goodunqnames);
badnames.addAll(badunqnames);
for (int i = 0; i < 2; i++) { 
for (String pkg : badpkgnames) {
…prepend pkg+goodsep to each badnames and goodnames and add to badnames...
}
for (char sep : "./;[".toCharArray()) {
for (String pkg : goodpkgnames) {
if (sep == goodsep)
  …prepend pkg+sep to each goodnames and add to goodnames…
else
…prepend pkg+sep to each goodnames and add to badnames…
}
…prepend sep to each goodnames and add to badnames…
…appendpend sep to each goodnames and add to badnames…
}
return List.of(badnames, goodnames); // return for use in unit test

> On Jan 7, 2019, at 10:17 AM, Vicente Romero  wrote:
> 
> I have updated the webrev after additional feedback from the TCK tester 
> please check last version at [1]
> 
> Thanks,
> Vicente
> 
> [1] http://cr.openjdk.java.net/~vromero/8215510/webrev.01
> 
> 
> On 1/3/19 12:21 PM, Vicente Romero wrote:
>> Please review the fix for bug [1] at [2]. Basically the constants API 
>> introduced as part of JEP-334 [3] were accepting descriptors and class names 
>> not allowed by the spec. This implementation fixes several issues found by 
>> TCK tests on JEP-334,
>> 
>> Thanks,
>> Vicente
>> 
>> [1] https://bugs.openjdk.java.net/browse/JDK-8215510
>> [2] http://cr.openjdk.java.net/~vromero/8215510/webrev.00/
>> [3] https://bugs.openjdk.java.net/browse/JDK-8203252
> 



Re: Feature suggestion: Add static equals methods to Float and Double

2019-01-07 Thread John Rose
As I think you expect, isSubstitutible(x,y) will mean that x and y are 
equivalent
for all practical purposes.  One hard question is nailing down what are
"all practical purposes".  Certainly it's unfair to flip a coin while evaluating
x and y separately, and claim that a distinct outcome proves a difference.
What about viewing the bits of x and y using the Unsafe API?  That's
unfair also, since it opens the door to implementation-dependent behavior
which might detect a difference (an irrelevant difference) between x and y.

Now, floatToRawIntBits can detect differences between NaNs which
have different numeric codes.  Are two such NaNs substitutable or not?
The evidence in favor:
 - They become equivalent when boxed in a Float, and Float claims to
  be an all-purpose box for float values.
 - The extra information produced by floatToRawIntBits is implementation
  specific, and in particular processor dependent.
 - Joe Darcy suggested to me that some processors, like x87, may
  perturb NaN bits (turning off the "signalling" bit, for example), even
  if the float value is simply bound to a parameter.  This means that
  the operand to floatToRawIntBits *might*, in compiled code, possibly
  have *some* of its bits perturbed.  (Thanks, Joe, for that and similar
  hair-raising stories.)
 - The previous point implies that compiled code and interpreted code
  might, in the same JVM instance, produce different results on the
  same argument.  That is quite implementation specific indeed!

The evidence against:
 - The existing standard API point floatToRawIntBits is not going away.
  So the isSubstitutable API point must document that floatToRawIntBits
  has the processor-dependent ability to conjure up different bits for
  x and y.  Maybe it should be called isAlmostSubstitutable??

The right trade-off here, I think, is to align isSubstitutable with 
Float::equals
and simply increase the warnings on floatToRawIntBits, that this method
can produce platform-specific results in an unpredictable way, and that
in particular it can produce distinct answers for otherwise substitutable
results.

I also suggested to Dr. Deprecator (Stuart Marks) that floatToRawIntBits
might be a candidate for deprecation; he said it would be a lot of expense
for relatively little benefit.  I think at least the javadoc for 
floatToRawIntBits
should not speak so confidently, as it does, of "preserving Not-a-Number
(NaN) values", as if these values were something that had a stable
semantics, as if they could somehow carry application information.

More background (thanks again to Joe):  The NaN bits don't have a standard
format.  Different CPUs can (and often do) disagree on which bits
mean what, and how standard arithmetic operations consume and produce
them.  There is apparently no agreed standard NaN pattern, although
Java favors the "all zero" bit pattern as normative.  Different CPUs
may disagree on which bits denote signaling or quiet NaNs, and when
such bits may be queried or modified.  Adding the possible distinct
treatment of the "same" NaN value in compiled vs. interpreted code
(as well as strictfp vs. non-strictfp code), and the use of "raw" NaN
bits seems a very risky proposition, useful only for people writing
processor-specific code, with great care.

In hindsight, I think it would have been nice to place the "raw bits"
API points to Unsafe or a separate module.  But when those API
points were designed (1.0), there were no such hiding places.
And it's probably too costly to fix now.  If the sweet spot is to
acknowledge the wart, but not let it spread, then we design the
substitutability test based on Float::equals, not floatToRawIntBits.

— John

On Jan 6, 2019, at 4:36 PM, Hans Boehm  wrote:
> 
> IIUC,  isSubstitutible() is not quite what's being proposed here. The
> proposed definition here uses floatToIntBits(), not floatToRawIntBits().
> 
> Hans
> 
> On Sun, Jan 6, 2019 at 3:59 PM Brian Goetz  wrote:
> 
>> Followers of Project Valhalla will see that this issue comes up when
>> defining equality on value types.  The relation you are looking for is
>> being called "substitutible"; it asks whether there is any way to
>> distinguish between two values.  For primitives other than float/double,
>> this coincides with `==`, and similarly for references.
>> 
>> An `isSubstitutible()` API point will likely emerge from Valhalla.



Re: Extending Java Arrays/Collection Sort API

2018-11-28 Thread John Rose
As you noticed, I jumped into the discussion mid-stream.

I've been itching to get index-based sorting into JDK ever since
I tried to code Burrows-Wheeler on the JDK and saw a huge footprint
overhead due to the need for a proxy object for each byte position
in the file to be compressed.

Here are some more direct responses, in the spirit of Brian's suggestion
that we are in an exploration stage.

> On Nov 14, 2018, at 12:01 AM, Laurent Bourgès  
> wrote:
> 
> Hi,
> 
> I am a scientist and the author of the Marlin renderer, integrated in
> OpenJDK.
> 
> In this renderer, I needed a fast sort algorithm on small almost sorted
> arrays, but more important, to deal with 2 arrays: x values and their
> corresponding edge indices (pointer like). I derived my MergeSort class to
> perform 2 swaps (x & edge ptr) as Arrays.sort() or TimSort() did not
> provide such API.

There are two distinct requirements here:  1. Selecting or tweaking a sorting
algorithms for particular expected patterns in the data set (almost-sorted).
2. Sorting two arrays in tandem, with one array supplying the sort key,
and the other serving as a passively permuted payload ("PPP").

Re 1: Another pattern sometimes expected is range-bounded values,
which may prompt consideration of bin-sort.  Perhaps unique values.
Permutation vectors have both properties.

Re 2: In some cases, it's not two arrays that need the work.  Sometimes
it is one array of keys, plus an index set; this can be reduced to two arrays
one of which is an int-array carrying indexes, but in some proposals you see
such an int-array appearing only as an output, with an implicit input of
the iota-vector for the other array's indexes.

More deeply, sometimes the index array is concrete, but the array of keys
is virtualized.  Canonical example (for me) is a set of indexes into a data
set such as a file.  (These indexes are dense for compression algorithms
like B-W, or could be sparse for lexical analysis of bulk text in situ.)  To
handle this properly, a good old index array is just fine, as long as the
sort algorithm *also* accepts an int-comparator.

There is no strong need to generalize to three arrays in tandem, or to
have separate algorithms for treating additional arrays as containing
secondary keys instead of PPP.  This is because given a two-array tandem
sort with PPP, plus an int-array sort with int-comparator, you can compose
many other shapes at a modest footprint cost by adding two index vector
to control the various sorts.

Simplified sketch of generalized tandem array sort:

1. Let L be the common length of the N arrays, A[N][L].
2. Create a new int index array P initialized to iota(L).
3. Create an int-comparator C over 0<=i,j 1. I started discussing about extending the Java Sort API with Vladimir
> Yaroslavskiy, the author of the famous Java DualPivotQuickSort, as he is
> currently improving it.
> 
> His coming DPQS 2018 class provides lot of sort algorithms, I propose to
> make them public API through a facade method in Arrays class:
> insertionSort(), heapSort(), mergeSort(), quickSort()... Of course, his
> current code is dedicated to DPQS, but his algorithm implementation could
> be generalized to propose the Best implementations as a public Java API
> (stable & unstable sorts)
> 
> For example in Marlin, I need trivial insertionSort() in the Helpers class,
> my MergeSort could be replaced by another best implementation.
> Let's not reinvent the wheel…

Indeed, let's not.  But remember that the JDK does not claim to be (and
cannot be) a collection of all the "wheels" we might need.

Here's an extra thought or two on tweakable algorithms:  One algorithm doesn't
fit all data sets although it's amazing how far we've gone in optimizing the 
JDK's
only sort algorithm.  The JDK can't support many new algorithms to go with sort
(and, ahem, parallelSort), as insertionSort, mergeSort, etc., etc.  Doing this 
would
amount to having a bunch of partially usable sort routines, more like 
"tryThiseSort"
"orTryThisIfItDidNotWork", or "dontUseThisSort".  Actually documenting and 
testing
a multi-kinded API for {insertion,merge,parallel,plain}Sort would be a 
nightmare,
IMO, since the properties of sort algorithms are hard to explain and verify.

It would be better if we could define an API for sorting algorithms, that a 
provider
could set up.  Then the various API points in Arrays.*sort(*) could be 
documented
as sugar for specific standard providers, via the API.  The provider API could
include not only first-class entry points for sorting various kinds of arrays or
tandem array pairs, but also provide queries for a user who wanted to be smart
about which provider to choose.

> 2. New Sort API to deal with indices...
> 
> For now, there is no mean to obtain the indices sorted on another value,
> see my introduction: edge indices sorted by their x positions.
> In data science, this is very useful and general.
> 
> What about an Arrays API extension:
> - Arrays.sort(int[] a, 

Re: Extending Java Arrays/Collection Sort API

2018-11-27 Thread John Rose
On Nov 27, 2018, at 9:49 AM, Brian Goetz  wrote:
> 
> Doug is right that there is an enormous potential list of sort methods, and 
> we can't include them all.  But I do miss the ability to extract indexes 
> instead of sorting the array itself.

Or, slightly more generally, sorting an int[] or long[] array with a comparator.
Sometimes you don't want an object per sorted entity; you want some kind
of oracle function that can compare two indexes into a non-concrete
(virtualized) set of values; the sort operation would output the reordered
ints.

Example:  Sorting indexes into a large text file according to some algorithm
such as Burrows-Wheeler.  You don't want to make all the substrings and
put them into a String[] array, and you don't even want to make CharSequence
objects that view into the text file.  You just want indexes into the text file 
to
be sorted.

IMO, the very simplest answer with significantly better semantics is:

  void sort(int[] a, (int[] a, int fromIndex, int toIndex, IntBinaryOperator 
cmp);

And maybe add parallelSort, and use a new IntComparator instead of
repurposing IntBinaryOperator.

Extracting the permutation vector of indexes from an array sorting operation
would then look like this:

public static  int[] sortToIndexes(T[] a, Comparator c) {
int[] perm = new int[a.length];  // starts as an iota vector
for (int i = 0; i < a.length; i++)  perm[i] = i;
sort(perm, 0, a.length, (i, j) -> c.compare(a[i], a[j]));  // NEW 
PRIMITIVE
return perm;
}

But there are other use cases for comparator-based sorting of indexes,
which start not with an iota vector, but with an array of index-like values
which may index into something other than an array (like a text file,
which is why 'long[] a' might be useful too).

In Valhalla many such use cases can be covered by using a flat array of
values which encapsulate the integer (or long) indexes, and sorting that.
The array of wrapped primitives will look like Object[] and so the existing
API point with a comparator would allow the flat array of ints (dressed as
value Objects) to be sorted according to any ad hoc ordering rule, including
treating them as indexes.

So maybe the answer is "wait for Valhalla".  Still, the lack of a comparator
on primitive arrays is an unnecessary limitation, which excludes index-based
computations from the portfolio of our sort methods.

— John

Re: RFR 8206955 MethodHandleProxies.asInterfaceInstance does not support default methods

2018-07-11 Thread John Rose
On Jul 11, 2018, at 9:32 AM, Peter Levart  wrote:
> 
> Sorry Paul for hijacking the thread, just answering to Remi ...
> 
> On 07/11/2018 05:31 PM, Remi Forax wrote:
>> - Mail original -
>>> De: "Peter Levart" 
>>> À: "Paul Sandoz" , "core-libs-dev" 
>>> 
>>> Envoyé: Mercredi 11 Juillet 2018 17:15:09
>>> Objet: Re: RFR 8206955 MethodHandleProxies.asInterfaceInstance does not 
>>> support default methods
>>> Hi Paul,
>>> 
>>> The patch looks ok. I hope IMPL_LOOKUP has access to all methods (even
>>> if located in package-private interfaces and/or in concealed packages of
>>> modules)?
>>> 
>>> Just a thought... Would it be possible to implement this API in terms of
>>> LambdaMetafactory ?
>>> 
>>> Regards, Peter
>> Hi Peter,
>> not with the current LambdaMetaFactory, the LambdaMetaFactory only accept 
>> some kind of method handles (constant method calls) not all kind of method 
>> handles.
>> 
>> That said the current implementation of MethodHandleProxies is very raw and 
>> not very efficient, we should use the same idea as the lambda meta factory 
>> i.e spin an anonymous class and use the mechanism of constant patching offer 
>> by unsafe.defineAnonymousClass to inject the method handle into proxy so it 
>> will work with any method handle.
>> 
>> For each interface, you should cache the bytecode of the anonymous class you 
>> want to load and use defineAnonymousClass with the method handle each time 
>> asInterfaceInstance is called.
> 
> If the generated class used invokeExact on the method handle, bytecode should 
> be generated specifically for each tuple (interface type, method handle 
> type), as the needed conversions of arguments/return values would be specific 
> for each distinct combination of the two types.
> 
> ...which would still mean that you would define new anonymous class for each 
> method handle instance, just the bytecodes would be generated once per 
> (interface type, method handle type) combination.
> 
> The method handle could then be constant-folded in the generated class, but 
> selection of the underlying proxy class would still be governed by the proxy 
> instance which would be invoked via the interface method on the functional 
> interface. If the proxy instance could not be constant-folded (i.e. was not 
> assigned to static final field and used from it), the combined invocation 
> performance would still not be the same as using invokeExact on the constant 
> method handle, would it?
> 
> So perhaps for this API it is more suitable to:
> 
> - define the specific proxy class once per (interface type, method handle 
> type) combination (and cache the class itself, not just bytecode)
> - have that proxy class implement a constructor taking the method handle and 
> assign it to a @Stable instance field
> - implement the single interface method as parameter/return value conversions 
> around invokeExact on the method handle taken from @Stable instance field
> 
> If such proxy instance was constant-folded, so would be the @Stable method 
> handle field, right?
> 
> What do you think of this strategy?


Here's my $0.02:

- Performance does not seem to be important yet for MHP; Paul is mainly 
concerned with functionality.
- But such implementation strategies are desirable for performance.
- The ClassSpecializer mechanism is a good candidate for managing the tuple 
species (it was factored from BMH).
- I think a reasonable balance would be for a MHP instance to factor into (a) a 
@Stable MH (or List::of), and (b) a @Stable tuple.

Downsides of factoring MHP into MH and tuple sub-objects:

- Three heap nodes instead of the minimum single node.
- Indirection costs.  (Note that constants fold through @Stable links including 
List::of or stable-array.)

Upsides:

- Fewer generated classes:  One per interface, one per payload tuple, one per 
MH.
- Possible shared code with publicly-visible tuple carrier API (probably needed 
for pattern "extractors").

More ideas:

An alternative is to do a full-custom species for every distinct MHP 
combination.  This might be OK,
until people start to use MHP in earnest; then it will cause startup problems 
IMO.  Factoring the
MHP nodes makes the thing scale better in terms of static code, at a modest 
cost in heap flatness.

If (long-term if) we want to flatten the MPH nodes, but only for hot-path MHPs 
(assuming in the long
term a wide variety of usage), then we might try Vladimir's MH customization 
hack, which puts
invocation counters on MH transform chains, and upgrades them in-place to use 
customized LFs
if they get hot.  Something like this could be done with MHP combos that turn 
hot.  Perhaps existing
MHPs that get hot could even be internally converted to delegate by one patched 
hop to a flattened
version of the object, and new MHPs of the same syndrome would just generate 
the flattened
version from the start, sans secret indirection.

— John

Re: ImmutableCollections.MapN optimisation ?

2018-06-30 Thread John Rose
On Jun 30, 2018, at 8:39 AM, Martin Buchholz  wrote:
> 
>> On Sat, Jun 30, 2018 at 3:54 AM, Remi Forax  wrote:
>> 
>> The other problem is more subtle, get() uses probe() and probe() uses an
>> infinite loop and not a counted loop, the result is that c2 generates  lot
>> of assembly codes for probe() as a consequence it doesn't inline probe() in
>> get() making get() really slow compared to HashMap.get() which is fully
>> inlined.
>> There are several wys to fix that:
>> - you can try to transform the while(true) loop in probe into 2
>> subsequents loops from idx to length + from 0 to idx.
>> 
> 
> In ArrayDeque we had success transforming loops into 2 nested loops, the
> inner one being hotspot friendly, and it seems like the same approach would
> work for probe().
> E.g. look at ArrayDeque#contains

That was my thought too. Best case is that the JIT unrolls both levels of loop. 
 Test it though; YMMV.  

I like to make the outer loop iterate from false to true, as a style thing. 
When it unrolls you get two customized inner loops.

– John


Re: RFR 8195650 Method references to VarHandle accessors

2018-06-25 Thread John Rose
Good fix. Reviewed. 

> On Jun 25, 2018, at 9:11 AM, Paul Sandoz  wrote:
> 
> Gentle reminder.
> 
> I would like to get this reviews and pushed before the ramp down phase one 
> kicks in this week.
> 
> Paul.
> 
>> On Jun 19, 2018, at 5:08 PM, Paul Sandoz  wrote:
>> 
>> Hi,
>> 
>> Please review the following fix to ensure method references to VarHandle 
>> signature polymorphic methods are supported at runtime (specifically the 
>> method handle to a signature polymorphic method can be loaded from the 
>> constant pool):
>> 
>> http://cr.openjdk.java.net/~psandoz/jdk/JDK-8195650-varhandle-mref/webrev/ 
>> 
>> 
>> I also added a “belts and braces” test to ensure a constant method handle to 
>> MethodHandle::invokeBasic cannot be loaded if outside of the j.l.invoke 
>> package.
>> 
>> Paul.
>> 
> 



Re: [core-libs] RFR (L): 8010319: Implementation of JEP 181: Nest-Based Access Control

2018-06-20 Thread John Rose
Good; I like the care to distinguish "nested" classes (using the
word "enclosing") from the newer concept of "nests" and "nestmates",
as well as the careful framing of how stuff bubbles up from the classfile.

(Kudos to Alex!)

— John

On Jun 19, 2018, at 9:56 PM, David Holmes  wrote:
> 
> Some further adjustments to getNestMembers() was made. Everything updated in 
> place.
> 
> Thanks,
> David
> 
> On 20/06/2018 9:30 AM, David Holmes wrote:
>> Sorry another update is imminent ... stay tuned.
>> David
>> On 19/06/2018 2:41 PM, David Holmes wrote:
>>> Discussions on the CSR request have led to further changes to the 
>>> documentation involving nests and the new nest-related method. There are no 
>>> semantic changes here just clearer explanations.
>>> 
>>> Incremental webrev: 
>>> http://cr.openjdk.java.net/~dholmes/8010319-JEP181/webrev.corelibs.v7-incr/ 
>>> 
>>>  
>>> 



Re: BiCollector

2018-06-18 Thread John Rose
On Jun 18, 2018, at 2:29 PM, Brian Goetz  wrote:
> 
> "bisecting" sounds like it sends half the elements to one collector and half 
> to the other …

The main bisection or splitting operation that's relevant to a stream is what
a spliterator does, so this is a concern.

Nobody has mentioned "unzipping" yet; this is a term of art which applies to 
streams
of tuples.  The image of a zipper is relatively clear and unambiguous, and the 
tradition
is pretty strong.  https://en.wikipedia.org/wiki/Convolution_(computer_science)

The thing we are looking at differs in two ways from classic "unzipping":  
First, the
two collectors themselves convert the same T elements to whatever internal value
(T1, T2) is relevant.  Second, we are looking at a new terminal operation (a 
collector) which
consolidates the results from both of streams (a notional Stream and 
Stream,
if you like), rather than delivering the streams as a pair of outputs.

The classic "unzip" operation applies "fst" and "snd" (or some other 
conventional
set of access functions) to each T-element of the input stream.  Since we don't
have a privileged 2-tuple type (like Pair) in Java, the user would need
to nominate those two functions explicitly, either by folding them into a 
"mapping"
on each collector, or as a utility overloading like this:

   unzipping(
Function f1,  // defaults to identity
Collector c1,
Function f2,  // defaults to identity
Collector c2,
BiFunction finisher) {
 return toBoth(mapping(f1, c1), mapping(f2, c2));
  }


> "tee" might be a candidate, though it doesn't follow the `ing convention.  
> "teeing" sounds dumb.


"tee" sounds asymmetrical.  "diverting" or "detouring" are "*ing" words that 
might
express asymmetrical disposition of derivative streams.

An asymmetrical operation might be interesting if it could fork off a stream of
its own.  It would have to have a side-effecting void-producing terminal 
operation,
so the main (undiverted) stream could continue to progress at the top level of
the expression.

interface Stream {
  default Stream diverting(Consumer> tee) { … }
}

values.stream().diverting(s2->s2.forEach(System.out::println)).filter(…).collect(…);

Or (and this might be a sweet spot) a symmetric stream-tee operation could
materialize two sibling streams and rejoin their results with a bifunction:

class Collectors {
  static  Stream unzipping(
Function, R1> f1,
Function, R2> f2,
BiFunction finisher) { … }
}

values.stream().unzipping(
s1->s1.forEach(System.out::println),
s2->s2.filter(…).collect(…),
(void1, r2)->r2
);

This would allow each "fork child" of the stream to continue to use the
Stream API instead of the more restrictive Collector operators.

Optimal code generation for forked/unzipped/teed streams would be tricky,
requiring simultaneous loop control logic for each stream.
To me that's a feature, not a bug, since hand-writing ad hoc
simultaneous loops is a pain.

My $0.02.

— John

Re: The store for byte strings

2018-06-09 Thread John Rose
On Jun 9, 2018, at 12:18 PM, Xueming Shen  wrote:
> 
> Ideally I would assume we would want to have a utf-8 internal storage for
> String, even in theory utf8 is supposed to be used externally and utf16
> to be the internal one.

Separately from my point about ByteSequence, I agree that "doubling down"
on Utf8 as a standard format for packed strings is a good idea.  A reasonable
way to prototype right now would be an implementation of CharSequence
that is backed by a byte[] (eventually ByteSequence) and has some sort of
fast access (probably streaming) to Utf16 code points.  To make it pay for
itself the Utf8 encoding should be applicable as an overlay in as many places
as possible, including slices of byte[] and ByteBuffer objects, and later
ByteSequences.

> Defensive copy when getting byte[] in & out of String object seems still
> inevitable for now, before we can have something like "read-only" byte[],
> given the nature of its immutability commitment.

We didn't need frozen char[] arrays to avoid defensive copying of String
objects, only an immutability invariant on the class.  We could pull a similar
trick with Utf8 by supplying a ByteSequence view of a String's underlying
bytes.  If the String has underlying chars (Utf16) a view is also possible,
although it is more difficult to get right (as you described).

— John

Re: The store for byte strings

2018-06-09 Thread John Rose
I'm glad to see you are thinking about this, Florian.

You appear to be aiming at a way to compactly store and manipulate
series of octets (in an arbitrary encoding) with an emphasis on using
those octets to represent strings, in the usual sense of character sequences.

Would you agree that this design problem factors well into a generic
problem of storing and manipulating octet sequences, plus a detachable
upper layer that allows strings (in various encodings) to be extracted
from those sequences?  I think the sweet spot here is to introduce
a "stringy but char-free" API which commits to dealing with chunks
of memory (viewed as octet sequences), regardless of how those
chunks will be interpreted.

In https://bugs.openjdk.java.net/browse/JDK-8161256 I discuss
this nascent API under the name "ByteSequence", which is analogous
to CharSequence, but doesn't mention the types 'char' or 'String'.

By "stringy" I mean that there are natural ways to index or partition an
existing sequence of octets, or concatenate multiple sequences into
one.  I also mean that immutability plays a strong role, enabling
algorithms to work without defensive copies.  Making it an interface
like CharSequence means we can use backing stores like ByteBuffer
or byte[], or more exotic things like Panama native memory, interoperably.

Here are some uses for an immutable octet sequence type:

 - manipulation of non-UTF16 character data (which you mention)
 - zero copy views (slices, modifiable or not) into existing types (ByteBuffer, 
byte[], etc.)
 - zero copy views into file images (N.B. requires a 'long' size property, not 
'int')
 - zero copy views to intra-classfile resources (CONSTANT_Bytes)
 - backing stores for Panama data structures and smart pointers
 - copy-reduced scatter and gather nodes associated with packet processing
 - octet-level cursors for parsers, scanners, packet decoders, etc.

If the ByteSequence views are value instances, they can be created
at a very high rate with little or no GC impact.  Generic algorithms
would still operate on them 

A mutable octet sequence class, analogous to StringBuilder, would
allow immutable sequences to be built with fewer intermediate copies,
just like with StringBuilder.

If the API is properly defined it can be inserted directly into existing types
like ByteBuffer.  Doing this will probably require us to polish ByteBuffer
a little, adding immutability as an option and lifting the 32-bit limits.
It should be possible to "freeze" a ByteBuffer or array and use it as
a backing store that is reliably immutable, so it can be handed to
zero-copy algorithms that work with ByteSequences. For some of
this see https://bugs.openjdk.java.net/browse/JDK-8180628 .

Independently, I want to eventually add frozen arrays, including
frozen byte[] arrays, to the JVM, but that doesn't cover zero-copy use
cases; it has to be an interface like CharSequence.

So the option I prefer is not on your list; it would be:

(h) ByteSequence interface with retrofits to ByteBuffer, byte[], etc.

This is more flexible than (f) the concrete ByteString class.  I think
the ByteString you are thinking of would appear as a non-public class
created by a ByteSequence factory, analogous to List::of.

— John

On Jun 9, 2018, at 3:27 AM, Florian Weimer  wrote:
> 
> Lately I've been thinking about string representation.  The world
> turned out not to be UCS-2 or UTF-16, after all, and we often have to
> deal with strings generally encoded as ASCII or UTF-8, but we aren't
> always encoded this way (and there might not even be a charset
> declaration, see the ELF spec).
> 
> (a) byte[] with defensive copies.
>Internal storage is byte[], copy is made before returning it to
>the caller.  Quite common across the JDK.
> 
> (b) byte[] without defensive copies.
>Internal storage is byte[], and a reference is returned.  In the
>past, this could be a security bug, and usually, it was adjusted
>to (a) when noticed.  Without security requirements, this can be
>quite efficient, but there is ample potential for API misuse.
> 
> (c) java.lang.String with ISO-8859-1 decoding/encoding.
>Sometimes done by reconfiguring the entire JVM to run with
>ISO-8859-1, usually so that it is possible to process malformed
>UTF-8.  The advantage is that there is rich API support, including
>regular expressions, and good optimization.  There is also
>language support for string literals.
> 
> (d) java.lang.String with UTF-8 decoding/encoding and replacement.
>This seems to be very common, but is not completely accurate
>and can lead to subtle bugs (or completely non-processible
>data).  Otherwise has the same advantages as (c).
> 
> (e) Various variants of ByteBuffer.
>Have not seen this much in practice (outside binary file format
>parsers).  In the past, it needed deep defensive copies on input
>for security (because there isn't an immutably backed ByteBuffer),
>and shallow copies for 

Re: RFR: CSR JDK-8203630 Add instance method equivalents for String::format

2018-05-23 Thread John Rose
On May 23, 2018, at 2:07 PM, Remi Forax  wrote:
> 
> Please, don't call it format too !
> Trying to do a :: on a class with instance methods and static methods with 
> the same doesn't work well, and format() being a varargs will not help.
> 
> I see several reasons to not introduce this method whatever its name 
> (interpolate ?),
> - String.format is very slow (compared to the String concatenation), 
> providing several new slow methods doesn't seem to go to the right direction
> - the problem can be solved easily by writing a the javadoc sentence saying 
> that one can use a static import, or write an article about String.format(), 
> or better talk with Eclipse/Jetbrains, so they will add a quickfix to do the 
> static import automatically like with assert* in tests.
> - at some point in the future, we may want to add a way to interpolate 
> strings in Java, the runtime support, StringConcatFactory, is already in the 
> JDK.

Refactoring static as non-static methods, or vice versa, is a very
reasonable design move, but the language makes it hard to do this
and retain backward compatibility.  It's a pity, and we could do something
to make this easy.  The basic idea, which I've been plumping for a long
time but hasn't got much love, is to simply allow static methods to be
called using instance method syntax, given appropriate opt-in.  Then
format would be opted-in and all would be well.

class String {
  public static __Fluent String format(String thsi, Object… args) { … }
}

See also:  https://bugs.openjdk.java.net/browse/JDK-8191530
This addresses interface statics, which serve as a stand-in for
non-interface final methods (because you can't override them).
The fact that they *also* require a special invocation syntax
(prefix instead of postfix/fluent) is just an accident of notation,
which should be adjustable by an API designer.

Legacy prefix and new postfix syntaxes would co-exist by
virtue of overload resolution rules, with legacy getting priority
much as with varargs and autoboxing.  We'd have to deprecate
and outlaw existing legacy semantics (for classes not interfaces)
of fluent notation for statics which *discards* the LHS of the dot
after evaluating it.  That's the trickiest bit, but the details are IMO
manageable.

I can dream.

— John

Re: [core-libs] RFR (L): 8010319: Implementation of JEP 181: Nest-Based Access Control

2018-05-21 Thread John Rose
On May 21, 2018, at 5:48 PM, David Holmes  wrote:
> 
> * A nest is a set of classes and interfaces (nestmates) that
> * form an access control context in which each nestmate has access to the
> * private members of the other nestmates.
> * The nest host is the class or interface designated to hold the 
> list of
> * classes and interfaces that make up the nest, and to which each of the
> * other nestmates refer.
> +* All nestmates are implicitly defined in the same runtime package.

FWIW I agree that this would be a fine addition.  Note that this is
not something the API enforces; the JVM requires, as a structural
constraint, that the various nestmates all share a common package
prefix.  Pointing this out in the reflection API doc is useful but not
strictly required; it won't affect compliance or dictate behavior not
already specified (indirectly) elsewhere.

So all of the above text is descriptive and not the primary normative spec.
of how nestmates work; that normative spec. is in the JVMS, not javadoc.
For this reason, it's reasonable and low-risk to add clarifying text like
this; it doesn't change behavior, so much as help users understand it.
When I've written stuff like this in the past, sometimes I make the link
explicit, by saying something like "…as stated in the JVMS, this API
produces…".  Then we have a little classroom session about the
implications of the JVMS.  Not normative, but illuminating.

In fact, making the spec. more user friendly is a good thing to pay
attention to in the implementation review, even after is has passed the
CCC and spec. expert reviews, although I would expect those earlier
reviews to produce a reasonably user-friendly result.  That said, such
things can be added later, and I deeply sympathize with David's
"immutable object" stance.  Also (in a different way) with him having
to field documentation RFEs that are equally applicable across
pre-existing APIs.  I've noticed that when we really dig into our
specs. at new points, we suddenly realize that they could have been
much better all along.  I propose we note this and file a tracking bug,
rather than wedge the improvements into new spec. only.

An *implementation* code review should not, in general, reopen spec.
issues that have already been reviewed.  If I were David in this case,
I'd feel like some of these review points are making about the same
progress as a revolving door does.  I don't know what the full fix is,
but part of it has to be agreeing to partition the problem and work
on one part at a time.  Which we do, of course, as a habit, but it
seems some highly interconnected features like nestmates attract
expansive reviews.  We could say "well, it's a fundamental JVM
change" but does that mean we should also so 100x times this
expansive review process for value types?  If so, it's going to take
longer than any of us want.

And, for smaller RFEs like nestmates, it's going to be hard doing
the next several.  I hope the next few will get through the various
review gauntlets with more ease.  I'm thinking specifically of sealed
classes (and maybe fields), final-means-final, lazy fields, and
frozen arrays.  All of these are, like nestmates, smaller features
that cut across many layers of the system, and so will need multiple
facets of review.  If any (or all) of those happen, let's learn from
this experience and do even better next time.

I'm not unhappy with this experience, but want to do even better.
Buy David a beer to get his viewpoint.  And, thank you to everyone
who has put eyeballs on these reviews; they are crucial to getting
a good outcome.

— John

Re: RFR: jsr166 jdk integration 2018-05

2018-05-19 Thread John Rose
On May 19, 2018, at 9:42 AM, Martin Buchholz  wrote:
> 
> I like thinking of ArrayList as just an optimized HashMap, Lua-style, and
> HashMap.replaceAll also does not increment modCount, supporting our
> "structural modification" position.
> 
> The thing that bothers me most about the status quo is the *inconsistency*
> - between root list and subList, between default implementation and
> concrete implementation, and between ArrayList and HashMap.  And there
> seems to be no reason for users to expect any such inconsistency.

FWIW my $0.02.

Sadly I don’t have time to read this whole CME thread, but it seems to
me that the written spec. has to lead here, not the de facto spec. of
how stuff throws CME today.

And we have to start with the javadoc in CME.java.  Sadly, it is more
allusive than definitive but we have to work with what we have.  When
read closely, as if it were a true spec instead of a mere handful of
examples, supports a pretty narrow interpretation of when CME can be
thrown:

A. When race conditions are detected (not done today at all, but would
be with thread-confined data!).

B. When a backing collection is modified between calls to an iterator
over it.

C. Overlapping cases A and B, when an iterator may experience
“arbitrary, non-deterministic behavior” at some later point.  (This
points, at least, to dangling pointers to inactive backing storage
after a resize, and also to buffering of values in the iterator which
duplicates the backing state.)

D. Finally, a catch-all “sequence of method invocations that violates
the contract of an object”.

The catch-all case D has (maybe) been treated as grounds for adding
lots of additional ad hoc checks in various implementations, but (to
follow the written spec) those ad hoc checks must be clearly
documented as “violating the contract of the object”.  And that means
that the object itself, in its subclass, must have documented a
contract that could be violated by an invalid sequence of calls to
various methods on that object.

At the same time, List (for example) is a very general-purpose object
and as such has a very simple (I’d say “categorical”) contract.  A
subtype of List might have a more rigorous contact and throw more CMEs
but the default for List as a whole the criteria should be as narrow
as possible, subject to the goals A, B, C, above.

This CME thread (the parts on the side of more CMEs) appeals to the
List specification where it says “otherwise perturb it in such a
fashion that iterations in progress may yield incorrect results”,
saying that when we spot a result that seems obviously incorrect, we
are permitted to throw a CME.

But I think in order to follow the spec. we need to insist that the
phrase “incorrect results” has a definite and documented meaning, or
else that it refers (as CME.java does) to some hypothetical subtype of
List that has additional contracts.  The phrase “incorrect results” in
List cannot be carte blanche to add checks where we are afraid the
programmer might have made an error even if we cannot prove one; that
would be just officious mind-reading.

I don’t know whose side this favors more, but I think we all can agree
that a careful and disciplined reading of the spec. will sort us out,
and that a narrow construction of the spec. evidence is probably safer
than a broad one.

Final point: Since the CME spec. says it is a best-effort bug
detection, we have nothing to lose from removing CME throws, if
consistency is the most important goal.  Adding new CME throws surely
must be justified by a careful, conservative reading of the existing
spec., which is what I have tried to provide in this note.

HTH
— John

Re: RFR(M): 8202745: Remove hyphens from "out-of-bounds".

2018-05-07 Thread John Rose
On May 7, 2018, at 2:43 PM, David Holmes  wrote:
> 
> The grammar can be a bit subtle here. IIUC we would say:
> 
> "Index %d out of bounds for length %d"
> 
> but if we turn it around we'd say:
> 
> "out-of-bounds index %d for length %d"

+1  And the hyphens could be dropped altogether,
which I think reads better, usually.

(Grammarian hat donned.)

You _may_ use "out-of-bounds" when it modifies a noun.
The form "out-of-bounds" is a compound adjective.
It is the compound-adjective form of "out of bounds".

See Rule 1 of this page:
  https://www.grammarbook.com/punctuation/hyphens.asp

Rule 10 on the same page says compound adjectives
that are well known do not need hyphens.

A compound adjective usage can omit hyphens if it will
be well understood.

A compound-adjective usage can include hyphens if it
might not be well-understood.

Thus, hyphenating the compound adjective is sometimes
a matter of taste.

Here's bad grammar:  "Hyphenating the compound-adjective
is sometimes a matter-of-taste."  Hyphenating nouns is
only correct if the noun specifically requires it, like the
numbers between twenty-one and ninety-nine, as
the noted grammarian Salman Al-Azami might observe.

(Grammarian hat doffed.)



Re: Hashing files/bytes Re: RFR(JDK11/NIO) 8202285: (fs) Add a method to Files for comparing file contents

2018-05-01 Thread John Rose
Here's another potential stacking:

Define an interface ByteSequence, similar to CharSequence,
as a zero-copy reference to some stored bytes somewhere.
(Give it a long length.)  Define bulk methods on it like hash
and mismatch and transferTo.  Then make File and ByteBuffer
implement it.  Deal with the cross-product of source and
destination types underneath the interface.

(Also I want ByteSequence as a way to encapsulate resource
data for class files and condy, using zero-copy methods.
The types byte[] and String don't scale and require copies.)

— John

On May 1, 2018, at 3:04 PM, fo...@univ-mlv.fr wrote:
> 
> - Mail original -
>> De: "Paul Sandoz" 
>> À: "Remi Forax" 
>> Cc: "Alan Bateman" , "nio-dev" 
>> , "core-libs-dev"
>> 
>> Envoyé: Mardi 1 Mai 2018 00:37:57
>> Objet: Hashing files/bytes  Re: RFR(JDK11/NIO) 8202285: (fs) Add a 
>> method to Files for comparing file contents
> 
>> Thanks, better then i expected with the transferTo method we recently added, 
>> but
>> i think we could do even better for the ease of use case of “give me the hash
>> of this file contents or these bytes or this byte buffer".
> 
> yes, it can be a nice addition to java.nio.file.Files and in that case the 
> method that compare content can have reference in its documentation to this 
> new method.
> 
>> 
>> Paul.
> 
> Rémi
> 
>> 
>>> On Apr 30, 2018, at 3:23 PM, Remi Forax  wrote:
>>> 
 
 To Remi’s point this might dissuade/guide developers from using this 
 method when
 there are other more efficient techniques available when operating at 
 larger
 scales. However, it is unfortunately harder that it should be in Java to 
 hash
 the contents of a file, a byte[] or ByteBuffer, according to some chosen
 algorithm (or a good default).
>>> 
>>> it's 6 lines of code
>>> 
>>> var digest = MessageDigest.getInstance("SHA1");
>>> try(var input = Files.newInputStream(Path.of("myfile.txt"));
>>> var output = new DigestOutputStream(OutputStream.nullOutputStream(), 
>>> digest)) {
>>>   input.transferTo(output);
>>> }
>>> var hash = digest.digest();
>>> 
>>> or 3 lines if you don't mind to load the whole file in memory
>>> 
>>> var digest = MessageDigest.getInstance("SHA1");
>>> digest.update(Files.readAllBytes(Path.of("myfile.txt")));
>>> var hash = digest.digest();
>>> 
 
 Paul.
>>> 
>>> Rémi



Re: RFR(JDK11/NIO) 8202285: (fs) Add a method to Files for comparing file contents

2018-04-30 Thread John Rose
On Apr 30, 2018, at 4:47 PM, Joe Wang  wrote:
> 
> Are there real-life use cases?  It may be useful for example to check if the 
> files have the same header.

After equality comparison, lexical comparison is a key use case.
By allowing the user to interpret the data around the mismatch,
the comparison can be made sensitive to things like locales.

As Paul implies, finding a mismatch is the correct operation to build
equality checks on top of, because (a) a mismatch has to be detected
anyway to prove inequality, and (b) giving the location of the mismatch,
instead of throwing it away, unlocks a variety of other operations.

If you want real-life use cases, look at uses of /usr/bin/cmp in Unix
shell scripts.  The cmp command is to Unix files what Paul's array
mismatch methods are to Java arrays.  Here's a man page reference:

https://docs.oracle.com/cd/E19683-01/816-0210/6m6nb7m6c/index.html

As with the array mismatch methods, the cmp command allows the
user to specify optional offsets within each file to start comparing, as
well as an optional length to stop comparing after.

See the file BufferMismatch.java for the (partial) application of these
ideas to NIO buffers.

I suppose the Java-flavored version of "cmp - file" would be a file
comparator which would take a byte buffer as a second operand,
and return an indication of the location of the mismatch.  Note that
"cmp - file" compares a computed stream against a stored file.

I think Paul and I have sketched a natural "sweet spot" for performing
bitwise comparisons on stored data.  It's up to you how much to implement.
I suggest that, if you don't feel inspired to do it all in one go, that you
leave room in the code for future expansions (maybe as with
BufferMismatch), and perhaps file a follow-up RFE.

— John



Re: RFR: JDK-8202105: jshell tool: on exiting, terminal echo is disabled

2018-04-25 Thread John Rose
On Apr 25, 2018, at 12:07 PM, Martin Buchholz  wrote:
> 
> For bonus points, only create the shutdown hook the first time readPassword
> is called with echo on to appease the Emacs shell users with pitchforks.

FTR, I use Emacs shell for everything except jshell, and have to
launch a separate shell app. (Terminal) to run jshell.  That's the only
reason I launch that app.  My pitchfork is still in the shed, but I feel
the pain here too.  There's probably a config. switch I'm missing,
but jshell has never worked for me under Emacs.



Re: RFR 8199875: Require first parameter type of a condy bootstrap to be Lookup

2018-04-06 Thread John Rose
Reviewed; it's good.

The javadoc text doesn't need to predict the future; it just needs to document
the present specification.  So the sentence that begins "This constraint allows 
for
the future possibility…" is not really necessary.  It's certainly not normative.

— John

On Apr 6, 2018, at 5:15 PM, Paul Sandoz  wrote:
> 
> Hi,
> 
> Please review this patch to constrain constant dynamic bootstrap methods to 
> methods whose first parameter type is MethodHandles.Lookup.
> 
>  http://cr.openjdk.java.net/~psandoz/jdk/JDK-8199875-condy-bsm-lookup/webrev/ 
> 
> 
> We are conservatively diverging from invoke dynamic bootstrap method 
> invocation behaviour to possibly diverge further in the future and allow for 
> constant dynamic bootstrap methods that are invoked without the 
> lookup/name/type arguments. The change enables further divergence in a future 
> release without breaking compatibility.
> 
> This would make it easier to use existing methods as bootstrap methods rather 
> than invoking via a level of indirection for explicit wrappers or using 
> ConstantBootstraps.invoke. The experience garnered from prototyping language 
> and low-level library features informs us this is useful.
> 
> CSR is here:
> 
>  https://bugs.openjdk.java.net/browse/JDK-8201268 
> 
> 
> Thanks,
> Paul.



Re: RFR: Some patches for sherman

2018-03-30 Thread John Rose
In short, Peter is right; sorry Martin.

That annotation is private to java.base and not currently a feature that
has support beyond its uses in java.base.  The purpose of @Stable is
to allow lazily evaluated variables (both static and non-static) to be
constant-folded like static finals.  In the case of a non-static @Stable
variable, the containing reference must first constant-fold in order for
the @Stable field to fold.  (The rules for arrays are even trickier.)

The optimizations for @Stable are not aimed at proving properties
of @Stable fields inside non-constant objects (or non-c. arrays).
But you are right, Martin, that the rules for @Stable would allow
the optimization you are asking for.  If we ever do lazy evaluation
as a first class language feature (I can dream), we'll want both
kinds of optimizations to be routine.

BTW, note that the rules of @Stable do not guarantee stability
of "z.res" unless "z.res" has been observed to be non-null in
a dominating code path.  As long as it has its default value
of null, the variable not yet guaranteed to be constant.  To
optimize this correctly we would probably need loop predication
(or perhaps peeling) to optimistically sample z.res and ensure
that it is non-null before going into the real loop.  I see that
"z.res" is a tuple of some sort, so it's likely that normal loop
code will null-check it implicitly, which is a start.  If the loop
peels for other reasons, then the stable invariant property
would fall out "for free".  But as I said, the JIT is not looking
for @Stable fields except inside of constant (not just
loop-invariant) objects.

I don't recommend using @Stable except as a way to extend
the reach of constant folding.  Proving other properties like
loop invariance isn't on the list yet.

The docs for @Stable don't spell this out as clearly as one
might want, but this sentence comes closest:  "More specifically,
the HotSpot VM will process non-null stable fields (final or
otherwise) in a similar manner to static final fields with respect to
promoting the field's value to a constant."

— John

On Mar 30, 2018, at 3:49 PM, Martin Buchholz  wrote:
> 
> The qualified people may not be reading this mailing list.  Adding 
> authors/reviewers of Stable.java.  The question is whether in 
> 
> 
> final ZipFile z = ...
> while (...) {
>   use(z.res); 
> }
> 
> 
> annotating instance field ZipFile.res as @Stable will enable the optimization 
> of reading the res field only once.
> 
> 
> On Fri, Mar 30, 2018 at 1:16 AM, Peter Levart  > wrote:
> I'm forwarding this private message to core-libs-dev so a qualified person 
> may read it...
> 
> Regards, Peter
> 
> 
>  Forwarded Message 
> Subject:Re: RFR: Some patches for sherman
> Date:   Fri, 30 Mar 2018 06:18:26 +
> From:   Peter Levart >
> To: Martin Buchholz >
> 
> 
> 
> Hi Martin,
> 
> On Fri, 30 Mar 2018, 05:28 Martin Buchholz,    >> wrote:
> 
> 
> 
>On Thu, Mar 29, 2018 at 7:57 AM, Peter Levart
> 
> >> wrote:
> 
>Hi Martin,
> 
>On 03/27/2018 02:21 AM, Martin Buchholz wrote:
> 8200124: Various cleanups in jar/zip
> http://cr.openjdk.java.net/~martin/webrevs/jdk/zip-cleanup/ 
> 
>  >
> https://bugs.openjdk.java.net/browse/JDK-8200124 
> 
> 
>As I already mentioned for your Thread cleanup patch, do you
>think that ZipFile objects are something that ever gets assigned
>to static final or @Stable fields? Only in that case would
>@Stable have any effect for ZipFile fields. @Stable should
>really be reserved for classes which instances are normally
>assigned to static final fields, such as ThreadLocal,
>ClassValue, etc... If the holder of the @Stable field can not be
>made constant, then neither can the value of that field.
> 
> 
>Peter, I was very surprised by your reply.  I re-read the doc for
>Stable and it says to me that in
> 
>ZipFile z = ...
> 
>while (...) {
>   z.res ...
>}
> 
>z.res only needs to be read once.
> 
> 
> This is true even without @Stable on res field. And JIT already does all it 
> can without it. What JIT cannot do in above situation is make the value of 
> res a compile-time constant embedded in generated code, possibly triggering 
> other optimizations that arrise from the 

Re: Raw String Literal Library Support

2018-03-14 Thread John Rose
On Mar 14, 2018, at 6:11 AM, Peter Levart  wrote:
> 
> Pattern.compile(string)
> 
> Now if 'string' above is a constant, '~ string' could be a constant too. 
> Combined with raw string literals, Pattern constants could be very compact.
> 
> 
> What do you think?

There's no need to introduce syntax in order to gain constant folding.

It's enough to ensure that Pattern.compiler(constant) reduces to
the ldc of a dynamic constant.  We are experimenting on such
ideas here:
  http://hg.openjdk.java.net/amber/amber/shortlog/condy-folding 


(This is very vaguely similar to constexpr in C++, but less static.
It's early days, but enough to show that syntax isn't necessary.)

— John

Re: Raw String Literal Library Support

2018-03-13 Thread John Rose
On Mar 13, 2018, at 6:47 AM, Jim Laskey  wrote:
> 
> …
> A. Line support.
> 
> public Stream lines()
> 

Suggest factoring this as:

 public Stream splits(String regex) { }
 public Stream lines() { return splits(`\n|\r\n?`); }

The reason is that "splits" is useful with several other patterns.
For raw strings, splits(`\n`) is a more efficient way to get the same
result (because they normalize CR NL? to NL).  There's also a
nifty unicode-oriented pattern splits(`\R`) which matches a larger
set of line terminations.  And of course splits(":") or splits(`\s`) will
be old friends.  A new friend might be paragraph splitting splits(`\n\n`).

Splitting is old, as Remi points out, but new thing is supplying the
stream-style fluent notation starting from a (potentially) large string
constant.

> B. Additions to basic trim methods. In addition to margin methods trimIndent 
> and trimMarkers described below in Section C, it would be worth introducing 
> trimLeft and trimRight to augment the longstanding trim method. A key 
> question is how trimLeft and trimRight should detect whitespace, because 
> different definitions of whitespace exist in the library. 
> ...
> That sets up several kinds of whitespace; trim's whitespace (TWS), Character 
> whitespace (CWS) and the union of the two (UWS). TWS is a fast test. CWS is a 
> slow test. UWS is fast for Latin1 and slow-ish for UTF-16. 

For the record, even though we are not talking performance much,
CWS is not significantly slower than UWS.  You can use a 64-bit int
constant for a bitmask and check for an arbitrary subset of the first
64 ASCII code points in one or two machine instructions.

> We are recommending that trimLeft and trimRight use UWS, leave trim alone to 
> avoid breaking the world and then possibly introduce trimWhitespace that uses 
> UWS.

Putting aside the performance question, I have to ask if compatibility
with TWS is at all important.  (Don't know the answer, suspect not.)
> …
> C. Margin management. With introduction of multi-line Raw String Literals, 
> developers will have to deal with the extraneous spacing introduced by 
> indenting and formatting string bodies. 
> 
> Note that for all the methods in this group, if the first line is empty then 
> it is removed and if the last is empty then it is removed. This removal 
> provides a means for developers that use delimiters on separate lines to 
> bracket string bodies. Also note, that all line separators are replaced with 
> \n.

(As a bonus, margin management gives a story for escaping leading and trailing
backticks.  If your string is a single line, surround it with pipe characters 
`|asdf|`.
If your string is multiple lines, surround it with blank lines easy to do.  
Either
pipes or newlines will protect backticks from merging into quotes.)

There's a sort of beauty contest going on here between indents and
markers.  I often prefer markers, but I see how indents will often win
the contest.  I'll pre-emptively disagree with anyone who observes
that we only need one of the two.

> public String trimMarkers(String leftMarker, String rightMarker)

I like this function and anticipate using it.  (I use similar things in
shell script here-files.)  Thanks for including end-of-line markers
in the mix.  This allows lines with significant *trailing* whitespace
to protect that whitespace as well as *leading* whitespace.

Suggestion:  Give users a gentle nudge toward the pipe character by
making it a default argument so trimMarkers() => trimMarkers("|","|").

Suggestion:  Allow the markers to be regular expressions.
(So `\|` would be the default.)

> 
> D. Escape management. Since Raw String Literals do not interpret Unicode 
> escapes (\u) or escape sequences (\n, \b, etc), we need to provide a 
> scheme for developers who just want multi-line strings but still have escape 
> sequences interpreted.

This all looks good.

Thanks,

— John

Re: Raw String Literal Library Support

2018-03-13 Thread John Rose
On Mar 13, 2018, at 11:30 AM, Volker Simonis  wrote:
> 
> Would it make sense to have a versions of "lines(LINE_TERM lt)" which
> take a single, concrete form of line terminator?

(Regular expressions for the win!)

Re: Raw String Literal Library Support

2018-03-13 Thread John Rose
On Mar 13, 2018, at 11:42 AM, Remi Forax  wrote:
> 
> it already exists :)
>  Stream stream = Pattern.compile("\n|\r\n|\r").splitAsStream(string);

You want ` instead of " there!

Somebody added support recently for `\R`, which is a more unicode-flavored
version of your pattern (or just `\n`).  Last time I looked it was missing; 
kudos
to whoever added it in.

There should be a fluent streamy syntax for splitting a string, 
string.splits(pat).
Java's incomplete embrace of fluent syntax is old news, *but* there is something
new here:  String expression size.

The raw strings are much larger than classic strings, and so they seem to need 
some
notational assistance that doesn't always require them to be enclosed in round 
parens
and mixed with other arguments.  Having more fluent methods on String seems like
a good move here.

This goes beyond raw strings, and parsing is hard, but maybe there's room for
richer versions of String.lines or String.splits, which can deliver both the 
surrounding
whitespace and one or more fields, for each line (or each paragraph or 
whatever):

public Stream matchResults(String regex) {
  return Pattern.compile(regex).matcher(this).results();
}

The point is that a MatchResult delivers both the whole substring and any groups
embedded in it as part of the match.  Plus indexes, which is nice sometimes.

— John



Re: RFR: 8198755: Reduce cost of InvokerBytecodeGenerator::isStaticallyInvocable/-Nameable

2018-02-27 Thread John Rose
Looks good.

On Feb 27, 2018, at 8:42 AM, Paul Sandoz  wrote:
> 
> 
> 
>> On Feb 27, 2018, at 7:42 AM, Claes Redestad  
>> wrote:
>> 
>> Hi,
>> 
>> when generating LF classes with InvokerBytecodeGenerator, almost 5% of 
>> executed code is concerned with determining if members and parameters are 
>> statically invocable and/or nameable, which we can assert is always true for 
>> MethodHandle arguments.
>> 
> 
> IIRC this is for the case where we can reliably crack open the method handle 
> (from it’s member name) and invoke it directly?

Yes, invoked directly from inside LF code in java.lang.invoke.

The 'member' parameter already gives evidence that somebody
has gotten the right to call the method indirectly via a MH,
and the LF generator wants to lower that to a direct invocation
at bytecode level.

That's why MH package-private members are called directly.

> 
>> Since MethodHandle targets and parameters are common, it's a profitable 
>> startup optimization to add a fast-path test for this.
>> 
>> Webrev: http://cr.openjdk.java.net/~redestad/8198755/jdk.00/
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8198755
>> 
> 
> 
> 957 for (int i = 0; i < mtype.parameterCount(); i++)
> 958 if (!isStaticallyNameable(mtype.parameterType(i)))
> 
> Revert back to: 
> 
> for (Class ptype : mtype.parameterArray())
>  if (!isStaticallyNameable(ptype))
> 
> ?

+1

— John

Re: [11] RFR 8195694: ConstantBootstraps.invoke does not preserve variable arity

2018-02-01 Thread John Rose
On Feb 1, 2018, at 12:45 AM, Paul Sandoz <paul.san...@oracle.com> wrote:
> 
> 
> 
>> On Jan 31, 2018, at 3:49 PM, John Rose <john.r.r...@oracle.com> wrote:
>> 
>> On second thought, you should also use invokeWithArguments to support jumbo 
>> arities. 
>> 
> 
> It does, but non-selectively based on the arity:
> 
> 245 return handle.invokeWithArguments(args);
> 
> 
>> This tricky idiom should be put into a utility method, package private for 
>> starters. A version of it also appears in BSM invocation code. 
>> 
> 
> Are you in part referring to the approach of switching on the number of 
> arguments and using invoke with unpacking for small cases?

That might be worth doing as an internal optimization in a reusable library 
function. But my main concern is correctness: packaging a tricky idiom instead 
of having users re-derive it by noticing bugs in the near-miss approximations. 

We put in withVarargs to capture some of the idiom, but we aren’t there yet. 
The big you are fixing had happened in several places and the root cause is the 
lack of an advertised best practice. Hence my interest in a utility method, 
eventually a public one. 

> 
> If you don’t object i would like to follow up on that with another issue.

Sure. 

> 
> 
>>> On Jan 31, 2018, at 3:23 PM, John Rose <john.r.r...@oracle.com> wrote:
>>> 
>>> If you remove the old asType call it’s good!
>>> 
> 
> 
> Ah! something went wrong when importing the patch from the amber repo. 
> Updated in place.
> 
> Paul.
> 
>>>> On Jan 31, 2018, at 3:15 PM, Paul Sandoz <paul.san...@oracle.com> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> Please review this fix to the invoke BSM so that it preserves variable 
>>>> arity, if any:
>>>> 
>>>> http://cr.openjdk.java.net/~psandoz/jdk/JDK-8195694-constant-bsms-invoke-arity/webrev/
>>>>  
>>>> <http://cr.openjdk.java.net/~psandoz/jdk/JDK-8195694-constant-bsms-invoke-arity/webrev/>
>>>> 
>>>> This will be pushed to the hs repo.
>>>> 
>>>> Thanks,
>>>> Paul.
>>> 
>> 
> 



Re: [11] RFR 8195694: ConstantBootstraps.invoke does not preserve variable arity

2018-01-31 Thread John Rose
On second thought, you should also use invokeWithArguments to support jumbo 
arities. 

This tricky idiom should be put into a utility method, package private for 
starters. A version of it also appears in BSM invocation code. 

> On Jan 31, 2018, at 3:23 PM, John Rose <john.r.r...@oracle.com> wrote:
> 
> If you remove the old asType call it’s good!
> 
>> On Jan 31, 2018, at 3:15 PM, Paul Sandoz <paul.san...@oracle.com> wrote:
>> 
>> Hi,
>> 
>> Please review this fix to the invoke BSM so that it preserves variable 
>> arity, if any:
>> 
>> http://cr.openjdk.java.net/~psandoz/jdk/JDK-8195694-constant-bsms-invoke-arity/webrev/
>>  
>> <http://cr.openjdk.java.net/~psandoz/jdk/JDK-8195694-constant-bsms-invoke-arity/webrev/>
>> 
>> This will be pushed to the hs repo.
>> 
>> Thanks,
>> Paul.
> 



Re: [11] RFR 8195694: ConstantBootstraps.invoke does not preserve variable arity

2018-01-31 Thread John Rose
If you remove the old asType call it’s good!

> On Jan 31, 2018, at 3:15 PM, Paul Sandoz  wrote:
> 
> Hi,
> 
> Please review this fix to the invoke BSM so that it preserves variable arity, 
> if any:
> 
>  
> http://cr.openjdk.java.net/~psandoz/jdk/JDK-8195694-constant-bsms-invoke-arity/webrev/
>  
> 
> 
> This will be pushed to the hs repo.
> 
> Thanks,
> Paul.



Re: [PATCH] Inefficient ArrayList.subList().toArray()

2018-01-29 Thread John Rose
On Jan 29, 2018, at 4:00 PM, John Rose <john.r.r...@oracle.com> wrote:
> 
> Then the various array-backed implementations
> (and their sublists) would simply override their respective
> stream views.  

BTW, this notion is more general than you might think.
It applies to collections, like the internal spined buffer,
which are backed by sequences of arrays.  You use
the spliterator of the array-backed stream to find
the "cracks" between the backing arrays.  After
subdividing appropriately, you get Arrays.copyOf
and its friends at the leaves of the outer loop.

As I said, I *think* this can work, and if it does
it will address the issue of duplicate or slow copies
more systematically than slowly expanding but
incomplete set of point-fixes, like the one under
review.


Re: [PATCH] Inefficient ArrayList.subList().toArray()

2018-01-29 Thread John Rose
Reviewed (officially).  This is a good point-fix.

— John

P.S.  I still think we have some tech. debt to discharge
in finding a way to generically provide zero-copy access
to array data, across interoperating collections APIs.

If we got the deeper, more general answer right, we would
be able to reformulate the present point-fix in terms of a
single viewing operation, and then apply it in other places
(like Arrays.asList) without more code duplication.

I think (though am not sure yet) that the Collection::stream
API point is the right place to inject zero-copy array viewing
capabilities.  A specialized array-backed Stream (SIZED,
of course) would provide a general but efficient connection
point, that would allow private arrays to be securely read
but not written.  Then the various array-backed implementations
(and their sublists) would simply override their respective
stream views.  The copyOfRange step would appear,
not in a bunch of point fixes, but centrally in the array-backed
stream code.  A package-private class could mediate
secret access to the backing array, if necessary.

See also this discussion about generic array-producing terminal
methods, near this message:
  http://mail.openjdk.java.net/pipermail/core-libs-dev/2017-December/050560.html

On Jan 29, 2018, at 3:30 PM, Martin Buchholz  wrote:
> 
> 
> 
> On Mon, Jan 29, 2018 at 3:10 PM, Paul Sandoz  > wrote:
> 
> 
> > On Jan 29, 2018, at 1:02 PM, Martin Buchholz  > > wrote:
> >
> > I'm going to expedite this a little since we are building further changes
> > on top.
> >
> > RFR: jsr166 jdk integration 2018-02
> > http://cr.openjdk.java.net/~martin/webrevs/jdk/jsr166-integration/overview.html
> >  
> > 
> >
> 
> Looks ok, but i personally would not categorize the F/J changes as 
> miscellaneous :-)
> 
> Sorry, we're still working on forkjoin; only the ArrayList changes are ready 
> to go.



Re: [PATCH] Inefficient ArrayList.subList().toArray()

2018-01-25 Thread John Rose
On Jan 25, 2018, at 2:02 PM, Сергей Цыпанов  wrote:
> 
> +return (T[]) Arrays.copyOfRange(root.elementData, offset, 
> size, a.getClass());


Giving this a quick glance:
I think you may want s/size/offset+size/.
There should be calls to checkForComodification.



Re: [11] RFR JDK-8194554: filterArguments runs multiple filters in the wrong order

2018-01-17 Thread John Rose
>> That's a good idea (I should have started with a simpler test).
>> 
>> Updated:
>>   http://cr.openjdk.java.net/~mchung/jdk11/webrevs/8194554/webrev.01/ 
>> 


Thanks, Mandy and Paul.  Reviewed.  — John



Re: JDK-8145371 ClassCastException thrown in LambdaFormEditor.getInCache

2018-01-08 Thread John Rose
That looks good to me.  Vladimir Ivanov should take a look.
Christmas vacation is just ending in Russia, so he should be around soon.

On Jan 8, 2018, at 6:41 PM, Martin Buchholz  wrote:
> 
> No takers? ... let's try again.  Despite uniform failure to reproduce this 
> except under a debugger, let's apply the simple obviously correct fix, 
> namely: 
>  /** Craft a LambdaForm customized for this particular MethodHandle */
>  /*non-public*/
>  void customize() {
> +final LambdaForm form = this.form;
>  if (form.customized == null) {
>  LambdaForm newForm = form.customize(this);
>  updateForm(newForm);
> 
> 
> 
> On Wed, Jan 3, 2018 at 1:51 PM, Martin Buchholz  > wrote:
> Let's try again without mail bounces ...
> 
> 
> On Wed, Jan 3, 2018 at 1:40 PM, Martin Buchholz  > wrote:
> Here at Google we tried to fix JDK-8145371 because it looked like it was 
> causing rare problems in production.  But after looking at the code, I'm not 
> sure...  Anyways,
> 
> http://cr.openjdk.java.net/~martin/webrevs/jdk/MethodHandle.form/ 
> 
> https://bugs.openjdk.java.net/browse/JDK-8145371 
> 
> Copying to a local in customize() must be an improvement.
> The UNSAFE publication in updateForm looks fishy, as does that comment in 
> MethodHandleImpl.java.  Is there something the fullFence() is supposed to do 
> that isn't taken care of by putObjectVolatile ?  
> 
> Feel free to take ownership of this bug from me.
> 
> --- a/src/java.base/share/classes/java/lang/invoke/MethodHandle.java
> +++ b/src/java.base/share/classes/java/lang/invoke/MethodHandle.java
> @@ -1660,13 +1660,13 @@
>  assert(newForm.customized == null || newForm.customized == this);
>  if (form == newForm)  return;
>  newForm.prepare();  // as in MethodHandle.
> -UNSAFE.putObject(this, FORM_OFFSET, newForm);
> -UNSAFE.fullFence();
> +UNSAFE.putObjectVolatile(this, FORM_OFFSET, newForm);
>  }
>  
>  /** Craft a LambdaForm customized for this particular MethodHandle */
>  /*non-public*/
>  void customize() {
> +final LambdaForm form = this.form;
>  if (form.customized == null) {
>  LambdaForm newForm = form.customize(this);
>  updateForm(newForm);
> diff --git 
> a/src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java 
> b/src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java
> --- a/src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java
> +++ b/src/java.base/share/classes/java/lang/invoke/MethodHandleImpl.java
> @@ -909,7 +909,7 @@
>  int c = count;
>  maybeCustomizeTarget();
>  if (c <= 1) {
> -// Try to limit number of updates. MethodHandle.updateForm() 
> doesn't guarantee LF update visibility.
> +// Try to limit number of updates.
>  if (isCounting) {
>  isCounting = false;
>  return true;
> 
> 
> 



Re: [10?] RFR: 8193128: Reduce number of implementation classes returned by List/Set/Map.of()

2017-12-22 Thread John Rose
On Dec 22, 2017, at 11:57 AM, John Rose <john.r.r...@oracle.com> wrote:
> 
> On Dec 21, 2017, at 5:31 PM, Stuart Marks <stuart.ma...@oracle.com> wrote:
>> 
>> I'd like a blanket assertion that view collections are not serializable.
>> 
>> Now this should be considered distinct from whether view collections are 
>> value-based; I'm fine with considering view collections to be value-based. 
>> It's just that some value-based collections are serializable and others 
>> aren't. In particular, given a serializable, value-based list, a sublist 
>> should not be serializable but it can be value-based. A sublist of a sublist 
>> should also not be serializable and can be value-based, and in this case we 
>> can do short-circuiting such as 'return this' if we like.
> 
> The two concerns can be separated, but they necessarily
> have a priority, and I think you have got the priority backward.
> Here's the way I see it:
> 
> 1. Value-based aggregates are inherently lossless when
> serialized, and so should (in general) be made serializable.
> 
> 2. Views are generally lossy when serialized, *if the backing
> store has a significant identity*, and should (in general) not
> be made serializable.
> 
> If we are going to make blanket statements, I claim the first
> is more fundamental than the second.  The second has
> priority only in history, since serialization was invented before
> the value-based distinction.
> 
> I think we should apply the rules in logical order 1, then 2,
> not historical order 2, then 1.
> 

P.S. I am talking about serialization as a general thing with
a future.  If you are talking only about legacy serialization,
then I can see there might be historical influences that would
invert the flow of the above logic.  Still, I'd like to tweak
legacy serialization in better directions AFAP, even if its
ultimately futile, so that when we come up with something
better we will have relatively more good precedents to
preserve.



Re: [10?] RFR: 8193128: Reduce number of implementation classes returned by List/Set/Map.of()

2017-12-22 Thread John Rose
On Dec 21, 2017, at 5:31 PM, Stuart Marks  wrote:
> 
> I'd like a blanket assertion that view collections are not serializable.
> 
> Now this should be considered distinct from whether view collections are 
> value-based; I'm fine with considering view collections to be value-based. 
> It's just that some value-based collections are serializable and others 
> aren't. In particular, given a serializable, value-based list, a sublist 
> should not be serializable but it can be value-based. A sublist of a sublist 
> should also not be serializable and can be value-based, and in this case we 
> can do short-circuiting such as 'return this' if we like.

The two concerns can be separated, but they necessarily
have a priority, and I think you have got the priority backward.
Here's the way I see it:

1. Value-based aggregates are inherently lossless when
serialized, and so should (in general) be made serializable.

2. Views are generally lossy when serialized, *if the backing
store has a significant identity*, and should (in general) not
be made serializable.

If we are going to make blanket statements, I claim the first
is more fundamental than the second.  The second has
priority only in history, since serialization was invented before
the value-based distinction.

I think we should apply the rules in logical order 1, then 2,
not historical order 2, then 1.



Re: RFR 8193832: Performance of InputStream.readAllBytes() could be improved

2017-12-20 Thread John Rose
On Dec 20, 2017, at 3:45 AM, Peter Levart  wrote:
> 
> allocation of ArrayList could be avoided. Imagine a use case where lots of 
> small files are read into byte[] arrays

+1 I was going to write a similar suggestion; thanks for sending it out.

These sorts of things often need to be designed to perform well at all
scales, not just large scales.

The new ArrayList(8) is dead weight if the data fits in the buffer.
So it's bad for scale < buffer length.

The earlier new ArrayList(128) is a noticeable overhead if the data
fits in two buffers.  So it's not so good for scale = M * (buffer length)
where M is about 2.

For a large scale result (M > 10), the footprint difference between
ArrayList(N) for various N is a second-order fraction.

— John

P.S. Road not taken:  Instead of new ArrayList(8) you could use
the default ArrayList constructor, and allocate it unconditionally.
It has a very low overhead if the ArrayList remains empty.  Using
that constructor could motivate you to simplify the code to use
the ArrayList unconditionally, since that would be just a single
heap node if it is not used to gather multiple results.  But I like
the "null" formulation despite its complexity.  So I'd probably
keep it the way Peter wrote it.



Re: Review Request JDK-8193325: StackFrameInfo::getByteCodeIndex returns wrong value if bci > 32767

2017-12-12 Thread John Rose
On Dec 12, 2017, at 5:24 PM, David Holmes  wrote:
> 
> as per getByteCodeIndex() it needs to be able to store "-1" to indicate 
> no-bci?

Yes, that's the essential requirement, to encode a "none" value.

FWIW, one natural way to do that here would be to make the int field
@Stable and have the accessor return one less than the stored field value.

Then, by the rules of @Stable, the field starts out at the logical value
of -1, and can transition to a valid BCI if patched to a non-zero value
in the range 1..(2^16).

Making it "final" and initialized to -1 is not too terrible either, although
that creates some tech. debt we need to discharge when upgrading
finals to be trustable.

— John

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-11 Thread John Rose
On Dec 11, 2017, at 6:00 PM, John Rose <john.r.r...@oracle.com> wrote:
> 
> class Arrays {
>  // people who want x.toArray can also use x.copy(Arrays::make):
>  static  T[] make(SequenceContent<T,T[]> content) { … }

Correction; that one needs an array type to know what to make:

static  T[] make(SequenceContent<T,T[]> content,
   Class arrayClass) { … }


> // ArrayList.copy uses this one internally as "myContent":
>  static  SequenceContent slice(T[] a, int start, int end) { … }
> }
> class List {
>  // people who want x.asUnmodifiableList can also use x.copy(List::make):
>  default SequenceCopier<T,List> make(SequenceContent<T,T[]> content) { … }
> 
>  // external access to a sized snapshot of a list:
>  default SequenceContent sizedContent() { … }
> }

Also, sizedContent is there for completeness but doesn't carry
as much weight as the other three.  It might be derived from
Collection::copy applied to the identity function, unless there
are scoping rules on the operand to Collection::copy.



Re: [10] RFR 8187254 Rearrange private MethodType constructors

2017-12-11 Thread John Rose
Good.

On Dec 11, 2017, at 2:02 PM, Paul Sandoz  wrote:
> 
> Please review this small fix to collapse the two private constructors of 
> MethodType into one. This consolidates all the copying/validation logic into 
> the factory method makeImpl.



Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

2017-12-11 Thread John Rose
On Dec 7, 2017, at 5:03 PM, Martin Buchholz  wrote:
> 
> (I'm still trying to love this new API)
> 
> The changes to the jsr166 tck are fine.
> 
> I'm not convinced that the new implementation for ArrayList is progress.
> The current implementation of toArray(T[]) does
> 
>// Make a new array of a's runtime type, but my contents:
>return (T[]) Arrays.copyOf(elementData, size, a.getClass());
> 
> which does not have to pay the cost of zeroing the array, so is potentially
> faster.  (depends on whether the VM provides cheap pre-zeroed memory?! what
> does jmh say?)

The VM does not do pre-zeroing.  We've done experiments over the years
with that and it has never beaten pre-garbaged memory, with zero filling
at creation time.  The reason is that we can vectorize the zero filling at
any time *and* omit zeroing at all in some important cases like copyOf.
OTOH, pre-zeroing assumes there is background bandwidth available
for pre-processing cache lines that are not yet in use, which isn't always
true; sometimes the mutator needs all the cache bandwidth.  All of which
is to say that the VM is likely *never* to do pre-zeroing, unless there is
some special hardware feature that does so, in bulk, in main memory.
Which AFAIK doesn't exist today.

This particular set of performance problems is (IMO) sensitive to the
number of passes over the input and output arrays, as well as over
any intermediate arrays.  Zeroing isn't a separate pass in the case
of copyOf, but in any two phase array creation protocol, zeroing *is*
likely to require a separate pass.  What do I mean by "two phase"?
I mean a scheme where in phase one an empty array is created,
and in phase two the array is filled with content.  If the two phases
are not reliably juxtaposed (as they are in copyOf) the VM must
conservatively fill the empty array with zeros.  Different code shapes
have (inherently) different inlining characteristics.  In the case of
toArray(IntFunction), three things must happen to drop the zeroing
pass (as in Arrays.copyOf):  First, the particular IntFunction must
fully inline (that's phase one), and second, the (phase two) loop
that populates the new array must be juxtaposed with the inlined
code of the IntFunction call, without intermediate logic that might
discourage the optimizer.  Third, both code shapes must use
intrinsics that are recognized by the optimizer.  These three
conditions are carefully engineered in the body of Arrays.copyOf,
but appear "emergently" ("accidentally") via the toArray(IF) API.
So the optimization isn't impossible, but it is subject to more risk,
and/or will require more optimizer work to support that code shape.

Bottom line:  Arrays.copyOf is not as arbitrary as one may think;
it is designed that way because it is easy to optimize.  And it covers
two interesting use cases in the present discussion, which are
(a) copy my internal array wholesale, and (b) make me a fresh
array from a zero-length exemplar.

> If we're not getting type safety and not getting significantly better
> performance, all we have is a more natural API.  But toArray(IntFunction)
> also seems not very natural to me,

Whether or not it bears on the present API discussion, I think it
would be fruitful to step back and ask ourselves the big picture
question here, "What is the natural shape of an array factory?"

I submit to you that such a factory is *not* an IntFunction, because
that only creates half of an array (or 0.01% of one), the empty
version that needs to be populated.  A natural array factory API
will provide two pieces of data:  First a size (an accurate, not a
provisional one like List.size), and second a series of values to
put in the array.  *That* structure is the natural, might I say
categorical, argument to an array factory.  Also a List factory.
As you can probably tell, I'm taking the value-based view of
things here.

(Proof by generalization:  If/when we eventually create frozen
arrays, and flattened arrays, and volatile arrays, and hybrid
instance-with-sized-tail arrays, it will not always be an option
to create the empty array in a first phase, and then fill it from
the outside.  Frozen arrays are the obvious counterexample,
but any immutable data structure will do, as will a private
object tail, even if mutable.)

The interface would look something like one of these:

interface ArrayContent { int size(); T get(int i); }
interface ArrayContent { int size(); Iterator iterator(); }
interface ArrayContent { int count(); Stream stream(); }
interface ArrayContent { Spliterator elements(); } //must be SIZED

That last suggests that we don't need a new API at all,
that perhaps the natural input to an array factory is just
a Spliterator with the SIZED characteristic, and a range
check on the estimateSize() result.  Or maybe a Stream.

There are obvious bootstrapping issues here, but I don't
think they are complex, because the source object can
create a tiny Stream or Spliterator 

Re: [10?] RFR: 8193128: Reduce number of implementation classes returned by List/Set/Map.of()

2017-12-08 Thread John Rose
On Dec 8, 2017, at 6:19 PM, Claes Redestad  wrote:
> 
> I think one can interpret the @implSpec in AbstracList::subList to suggest
> that the implementation retrieved will subclass AbstractList, which implies
> it's *not* Serializable. As we move away from AbstractList then we can of
> course choose our own spec, but since it's established behavior I tried as
> best I could to retrofit behavior…

Maybe you are right about that, but AbstractList is *not* part of the API
of the List returned by List.of, so you are free to do whatever is right
for the value-based list, ignoring all parts of AbstractList that are not
immediately valuable to you.

(Yes, someone could "notice" that the result of List.of seems to be
castable to AbstractList, and *then* lodge a complaint that its sublist
is serializable, but that's a forced scenario; if we deem it is a real
bug to fix, at some date far in the future, we can fix it by removing
AbstractList from the supers of the List.of result.)

> As time is likely up for getting this into 10 then I guess we might as well
> take a step back and do this right for 11. I suspect we'll need a CSR for
> this RFE regardless.

No, we don't need a CSR.  It's unspecified behavior.  And in a brand new
API.  This is a valid refinement from murky to clear, not a change in spec.
(Because it's brand new, there's not even a de facto spec. to worry about.)

> Keeping it all down to two classes as you suggest allows any mixed usage to
> stay in the realm of bimorphism which might bring a considerable speedup
> in some cases, and the reduction in number of implementation classes
> loaded is a boon. Having all methods in ListN account for the offset might
> cost us a few percent here and there, though.

I don't think it will.  It's a fetch of an addend (usually zero) from a cache 
line
that is already hot.  Hot fetches and extra addends are usually in the noise.

> An int offset on ListN would actually be footprint neutral compared to the
> pre-existing implementation which pulls in the int modCount from AbstractList.

Yep.  And it will be an improvement to the number of classes loaded, both in
the status quo and in your current version (with bespoke sublists and their
iterators).

— John

P.S.  The extra cut-n-paste of the helper classes was bothering me, enough
to consider asking for new API points for sublist views.  But if you do the
ListN with an offset, those problems can be pushed off for another day.

Re: [10?] RFR: 8193128: Reduce number of implementation classes returned by List/Set/Map.of()

2017-12-08 Thread John Rose
On Dec 8, 2017, at 4:45 PM, John Rose <john.r.r...@oracle.com> wrote:
> 
> Can anyone point out a reason why the value based
> lists of List.of() should serialize while the value based
> lists of List.of().subList() should not?  Or is there some
> reason we should not allow subList to produce value
> based lists?

This gets to the heart of a question about how thoroughly we are going
to adopt the concept of value-based.  I think we want to accept that a
sub-collection derived from a value-based collection is itself
value-based, whether or not the API point (designed *before* value
based data structures) is supposed to deliver a “view” or not.  I.e.,
a view of a value-based collection is indistinguishable from a
non-view, and there are performance costs to maintaining a pretended
distinction, so let’s tune our spec. to avoid those costs.

If I'm right about this, perhaps the most natural way to balance Claes's
design is to add an offset field to ListN, and allow ListN to project arbitrary
slices out of a backing array.

Then we have only two classes (ListN, List12) even when folks are
slicing all around their lists.  That strikes me as (in terms of number
of loaded classes) much more economical than the stuff with a new
bespoke SubList class, with it's own new bespoke iterator class.
And I think the extra int field (in ListN) will melt into the noise.

— John

Re: [10?] RFR: 8193128: Reduce number of implementation classes returned by List/Set/Map.of()

2017-12-08 Thread John Rose
+1 (there should)

On Dec 8, 2017, at 9:44 AM, Martin Buchholz  wrote:
> 
> Should there be changes to general purpose collection testing classes like
> test/jdk/java/util/concurrent/tck/CollectionTest.java
> test/jdk/java/util/Collection/MOAT.java
> that would have caught this bug?
> (although those are mostly designed for mutable collections, but I think
> some immutable collections (nCopies?) get tested somewhere.




  1   2   3   4   >