[ 
https://issues.apache.org/jira/browse/DAFFODIL-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17330469#comment-17330469
 ] 

Mike Beckerle commented on DAFFODIL-2502:
-----------------------------------------

Strange. Last night I know I found javadoc saying DataInputStream could 
return 0 bytes, and not just in the degenerate case where the passed-in length 
is 0.  It was an "aha" moment.

However, I'm not finding that again, right now. So perhaps there is no such 
problem with the signatures or it was non-authoritative javadoc.

I will determine whether the user is doing non-blocking reads using Channels 
(which do support non-blocking behavior via selectors). It's possible they 
have a channel in non-blocking mode, but are wrapping it in an 
InputStream-like class that doesn't actually obey the contract of read.

Evidence for InputStreams always being blocking: both the Java Socket API and 
Apache Commons HttpConnect are careful to throw an exception if you create an 
InputStream from a Channel in non-blocking mode. That prevents the 
InputStream from accidentally becoming a non-blocking reader that violates the 
contract of read(buf, off, len).
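
A minimal sketch of how this can be checked with the plain JDK, assuming a 
java.nio Pipe as a stand-in for a socket channel (the class and method names 
are illustrative only). Per the javadoc of Channels.newInputStream, the read 
methods of the resulting stream throw IllegalBlockingModeException while the 
underlying channel is in non-blocking mode, so in this JDK case the exception 
surfaces on the read call.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;
import java.nio.channels.IllegalBlockingModeException;
import java.nio.channels.Pipe;

public class NonBlockingStreamCheck {

    /**
     * Returns true if reading through an InputStream wrapped around a
     * non-blocking channel is rejected with IllegalBlockingModeException.
     */
    public static boolean readIsRejected() throws IOException {
        Pipe pipe = Pipe.open();
        try {
            // Put the readable end into non-blocking mode, then wrap it.
            pipe.source().configureBlocking(false);
            InputStream in = Channels.newInputStream(pipe.source());
            try {
                in.read(new byte[16], 0, 16);
                return false; // the read was allowed to proceed
            } catch (IllegalBlockingModeException e) {
                return true;  // the stream refused to do a non-blocking read
            }
        } finally {
            pipe.source().close();
            pipe.sink().close();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read rejected: " + readIsRejected());
    }
}
```

This only covers streams created by the JDK itself. A user-written 
InputStream-like wrapper around a channel has no such guard unless its author 
adds one.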

So that's encouraging. The problem may have nothing to do with the read(buf, 
off, len) method actually.

It's also possible that the problem here is in the Daffodil isAtEnd method, 
e.g., a bug where the available() method returns 0 and Daffodil code in the 
I/O layer mistakenly assumes this means the stream has ended. The user 
suggested they observed something like this, but I have yet to reproduce it 
and did not spot it in the I/O layer code offhand.
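
To illustrate the kind of bug suspected here, a minimal sketch follows. It is 
not Daffodil's actual isAtEnd code; every name in it is hypothetical. It 
contrasts an end-of-data test based on available() == 0 with one based on 
read() returning -1, using a stream that holds data but reports 0 available 
bytes (which is what InputStream's default available() returns).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

public class EndOfStreamCheck {

    /** Hypothetical stream that has data but always reports available() == 0. */
    public static final class ZeroAvailableStream extends InputStream {
        private final InputStream in;

        public ZeroAvailableStream(byte[] data) {
            this.in = new ByteArrayInputStream(data);
        }

        @Override
        public int read() throws IOException {
            return in.read();
        }

        @Override
        public int available() {
            return 0; // "nothing readable without blocking", not "end of data"
        }
    }

    /** Suspect test: treats "no bytes available right now" as end of data. */
    public static boolean isAtEndByAvailable(InputStream in) throws IOException {
        return in.available() == 0;
    }

    /** Safer test: only a read() result of -1 signals end of data. */
    public static boolean isAtEndByRead(PushbackInputStream in) throws IOException {
        int b = in.read();   // blocks until a byte arrives or the stream ends
        if (b == -1) {
            return true;
        }
        in.unread(b);        // put the byte back so the caller can still read it
        return false;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3};
        System.out.println("by available(): "
            + isAtEndByAvailable(new ZeroAvailableStream(data)));
        System.out.println("by read():      "
            + isAtEndByRead(new PushbackInputStream(new ZeroAvailableStream(data))));
    }
}
```

The read-based test blocks when no data has arrived yet, which is consistent 
with the blocking contract of InputStream discussed above.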

> Parse must behave properly for reading data from TCP sockets
> ------------------------------------------------------------
>
>                 Key: DAFFODIL-2502
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-2502
>             Project: Daffodil
>          Issue Type: Bug
>          Components: API, Back End
>    Affects Versions: 3.0.0
>            Reporter: Mike Beckerle
>            Assignee: Mike Beckerle
>            Priority: Major
>
> Daffodil assumes the input streams are like files - reads are always blocking 
> for either 1 or more bytes of data, or End-of-data.
> People want to use Daffodil to read data from TCP/IP sockets. These can 
> return 0 bytes from a read because there is no data available, but that does 
> NOT mean the end of data. It's just a temporary condition. More data may come 
> along.
> Daffodil's InputSourceDataInputStream is wrapped around a regular Java input 
> stream, and enables us to support incoming messages which do not conform to 
> byte-boundaries.
> The problem is that there's no way for users to wrap an 
> InputSourceDataInputStream around a TCP/IP socket, and have it behave 
> properly when a read() call temporarily says 0 bytes available.
> Obviously we don't want to sit in a tight loop just retrying the read until 
> we get either some bytes or end-of-data.
> The right API here is that if the read() of the underlying java stream 
> returns 0 bytes, that a hook function supplied by the API user is called.
> One obvious thing a user can do is put a call to Thread.yield() in the hook. 
> (That might even want to be the default behavior if they supply no hook.) 
> Then if they have a separate thread parsing the data with daffodil, that 
> thread will at least yield the CPU, i.e., behave politely in a multi-threaded 
> world.
> More advanced usage could start a Daffodil parse using co-routines, returning 
> control to the caller when the parse must pause due to read() of the Java 
> input stream returning 0 bytes.
>  
>  
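
A minimal sketch of the hook idea in the issue description above, assuming an 
underlying stream whose read can return 0 bytes when len > 0 (the condition 
the issue describes). ZeroReadHookStream and its constructor parameters are 
hypothetical names, not part of the Daffodil API. The default hook is 
Thread.yield(), as the description suggests.

```java
import java.io.IOException;
import java.io.InputStream;

public class ZeroReadHookStream extends InputStream {
    private final InputStream in;
    private final Runnable onZeroBytes; // user-supplied hook (hypothetical)

    public ZeroReadHookStream(InputStream in, Runnable onZeroBytes) {
        this.in = in;
        this.onZeroBytes = onZeroBytes;
    }

    /** With no hook supplied, default to yielding the CPU. */
    public ZeroReadHookStream(InputStream in) {
        this(in, Thread::yield);
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0; // degenerate case permitted by the read contract
        }
        int n;
        // Call the hook each time the underlying read yields 0 bytes, then
        // retry, so callers only ever see >= 1 byte or -1 (end of data).
        while ((n = in.read(buf, off, len)) == 0) {
            onZeroBytes.run();
        }
        return n;
    }

    @Override
    public int read() throws IOException {
        byte[] one = new byte[1];
        int n = read(one, 0, 1);
        return n == -1 ? -1 : one[0] & 0xFF;
    }
}
```

A hook that sleeps briefly or waits on a selector would avoid the tight retry 
loop the description warns about; Thread.yield() alone only makes the loop 
more polite to other threads.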



