Re: RFR: 8329829: HttpClient: Add a BodyPublishers.ofFileChannel method [v3]

Volkan Yazici Fri, 18 Jul 2025 02:45:10 -0700

On Thu, 17 Jul 2025 13:18:10 GMT, Jaikiran Pai <j...@openjdk.org> wrote:


>> Volkan Yazici has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Improve docs on `IndexOutOfBoundsException` thrown
>>   
>>   Co-authored-by: Daniel Fuchs <67001856+df...@users.noreply.github.com>
>
> src/java.net.http/share/classes/java/net/http/HttpRequest.java line 746:
> 
>> 744:          */
>> 745:         public static BodyPublisher ofFileChannel(FileChannel channel, 
>> long offset, long length) {
>> 746:             Objects.requireNonNull(channel, "channel");
> 
> The method's javadoc should have a `@throws NullPointerException if channel 
> is null`. Unless that's covered by a wider statement about null parameters 
> some place else? I couldn't find it in this class.

Implemented in 8f66a32ec8b.

> src/java.net.http/share/classes/java/net/http/HttpRequest.java line 747:
> 
>> 745:         public static BodyPublisher ofFileChannel(FileChannel channel, 
>> long offset, long length) {
>> 746:             Objects.requireNonNull(channel, "channel");
>> 747:             return new RequestPublishers.FileChannelPublisher(channel, 
>> offset, length);
> 
> Hello Volkan, I see that the constructor of `FileChannelPublisher` attempts 
> to determine the file size of the channel and if an `IOException` is thrown 
> then it gets propagated as  `UncheckedIOException`. I think an alternate 
> approach and perhaps a better one would be to specify in this new method's 
> documentation that the `FileChannel`'s size will be first determined by this 
> method to verify that the given `offset` and the `length` are within bounds 
> of that size. That then allows us to add a throws clause to this method which 
> we merely propagate from the call of `FileChannel.size()`.

Changed as suggested in 8f66a32ec8b.

> src/java.net.http/share/classes/jdk/internal/net/http/RequestPublishers.java 
> line 433:
> 
>> 431:             this.channel = Objects.requireNonNull(channel, "channel");
>> 432:             long fileSize = fileSize(channel);
>> 433:             Objects.checkFromIndexSize(offset, length, fileSize);
> 
> My preference would be to move these file size determination and the 
> offset/length bounds check to the call site of this constructor. That way 
> it's much more visible at the place where this behaviour is specified (by the 
> public API).

My motivation for keeping these checks at the `FileChannelPublisher::new` was

1. `FilePublisher` in the same file follows this convention too
2. Even though `jdk.internal.net.http` is not exported, the class is `public`. 
I wanted to ensure `FileChannelPublisher::new` doesn't accept invalid input

@jaikiran, given these,

1. Do you still prefer me moving the checks to the `ofFileChannel`?
2. If yes, do you also want me to duplicate the checks in 
`FileChannelPublisher::new` too?

> src/java.net.http/share/classes/jdk/internal/net/http/RequestPublishers.java 
> line 476:
> 
>> 474: 
>> 475:         @Override
>> 476:         public synchronized boolean hasNext() {
> 
> In context of the synchronizations in this class, does a single instance of 
> `FileChannelIterator` gets accessed concurrently? I haven't fully grasped the 
> publisher/subscriber model through which this `FileChannelIterator` instance 
> would be handed out and how that then ends up as a `BodyPublisher` that's 
> accessed by the application.

[Reactive Streams Specification for 
JVM](https://github.com/reactive-streams/reactive-streams-jvm) states the 
following:

> #### Publisher
> ...
> `onSubscribe`, `onNext`, `onError` and `onComplete` signaled to a 
> `Subscriber` MUST be signaled _serially_
> ...
> #### Subscriber
> ...
> A `Subscriber` MUST ensure that all calls on its `Subscription`'s request and 
> cancel methods are performed _serially_

Thus we indeed don't need synchronization in our publishers and subscribers. In 
fact, I explicitly investigated this concern for #23096, where we eventually 
dropped the synchronization for `LimitingSubscriber`, and this approach was 
blessed by @viktorklang-ora back then. The reason I opted for synchronization 
in `FileChannelIterator` was:

1. `StreamIterator` (located in the same class, and used for `ofFile()`) was 
synchronized too and I wanted to stick to the present convention
2. `PullPublisher` (which `StreamIterator`, `FileChannelIterator`, etc. 
instances are passed to _per subscription_) operates using a 
`SequentialScheduler`, and the _scheduler_ in the name made me more skeptical 
about concurrency implications

But giving this a second thought, I agree that we should not implement 
redundant synchronization. I've removed synchronization for 
`FileChannelIterator` in b114fe9598e.

> src/java.net.http/share/classes/jdk/internal/net/http/RequestPublishers.java 
> line 490:
> 
>> 488:                 int readLength = channel.read(buffer, position);
>> 489:                 // Short-circuit if `read()` has failed, e.g., due to 
>> file content being changed in the meantime
>> 490:                 if (readLength < 0) {
> 
> The `FileChannel.read(ByteBuffer, position)` API states:
> 
>> If the given position is greater than or equal to the file's current size 
>> then no bytes are read.
> 
> I think we might have to consider the situation where the underlying file's 
> size has reduced and this position is now past that size and the `read()` 
> call returning 0. Would that then end up in a situation where we have a 
> "never ending" BodyPublisher, which publishes nothing but hasn't terminated 
> either?

The `FileChannel.read(ByteBuffer, position)` API further states the following:

> [Method returns] The number of bytes read, possibly zero, or `-1` if the 
> given position is greater than or equal to the file's current size

That is, the case you're describing is indicated by the `-1` return value, 
which we check in the next line. Plus, we already have a test for precisely the 
scenario that you're describing: `testFileModificationDuringPublisherRead`.

> src/java.net.http/share/classes/jdk/internal/net/http/RequestPublishers.java 
> line 492:
> 
>> 490:                 if (readLength < 0) {
>> 491:                     // We *must* throw to signal that the request needs 
>> to be cancelled.
>> 492:                     // Otherwise, the server will continue waiting data.
> 
> The comment itself is a good idea but i think we should leave out the 
> reference to a "server" from this comment. Perhaps something like: "Throw to 
> signal that the request needs to be cancelled"?

Changed as suggested in 183da9d0235.

> src/java.net.http/share/classes/jdk/internal/net/http/common/Utils.java line 
> 372:
> 
>> 370: 
>> 371:     /**
>> 372:      * {@return a new {@link ByteBuffer} instance of configured 
>> capacity for the HTTP Client}
> 
> I think instead of saying "configured capacity" we should say "instance of 
> {@link BUFSIZE} ...". This is an internal class so it should be OK to link to 
> this constant even if the constant itself isn't documented. Linking to 
> `BUFSIZE` makes it clear and easier to find what the actual capacity is.

Changed as requested in eacd5da23bb.

> src/java.net.http/share/classes/jdk/internal/net/http/common/Utils.java line 
> 384:
> 
>> 382:      *
>> 383:      * @param maxCapacity a buffer capacity, in bytes
>> 384:      * @throws IllegalArgumentException if {@code capacity < 0}
> 
> Typo - should have been `if {@code maxCapacity < 0}`

Fixed in da69840639e.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26155#discussion_r2215596609
PR Review Comment: https://git.openjdk.org/jdk/pull/26155#discussion_r2215596735
PR Review Comment: https://git.openjdk.org/jdk/pull/26155#discussion_r2215596298
PR Review Comment: https://git.openjdk.org/jdk/pull/26155#discussion_r2215596074
PR Review Comment: https://git.openjdk.org/jdk/pull/26155#discussion_r2215595182
PR Review Comment: https://git.openjdk.org/jdk/pull/26155#discussion_r2215595719
PR Review Comment: https://git.openjdk.org/jdk/pull/26155#discussion_r2215594742
PR Review Comment: https://git.openjdk.org/jdk/pull/26155#discussion_r2215594608

Re: RFR: 8329829: HttpClient: Add a BodyPublishers.ofFileChannel method [v3]

Reply via email to