[
https://issues.apache.org/jira/browse/DAFFODIL-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17330410#comment-17330410
]
Steve Lawrence commented on DAFFODIL-2502:
------------------------------------------
My reading of the InputStream API is that it should never return zero bytes.
Does a TCP Socket InputStream do that?
As a workaround that could be impelmented without modifing Daffodil, one could
implement a custom InputStream that wraps the TCP/Socket connection. When
Daffodil calls read, the custom InputStream attempts to read from the socket.
If it gets any bytes then it just returns them to Daffodil for normal
processing. But if it gets zero bytes, it can wait a custom timeout controlled
by this custom InputSTream. If the socket still continues to get zero bytes
after so many timeouts, or the connection is closed, or whatever errors the
user wnats to handle, then it can throw an IOException which should cause the
parse to fail (untested), or end of data which let's the parse end, potentially
with success.
> Parse must behave properly for reading data from TCP sockets
> ------------------------------------------------------------
>
> Key: DAFFODIL-2502
> URL: https://issues.apache.org/jira/browse/DAFFODIL-2502
> Project: Daffodil
> Issue Type: Bug
> Components: API, Back End
> Affects Versions: 3.0.0
> Reporter: Mike Beckerle
> Assignee: Mike Beckerle
> Priority: Major
>
> Daffodil assumes the input streams are like files - reads are always blocking
> for either 1 or more bytes of data, or End-of-data.
> People want to use Daffodil to read data from TCP/IP sockets. These can
> return 0 bytes from a read because there is no data available, but that does
> NOT mean the end of data. It's just a temporary condition. More data may come
> along.
> Daffodil's InputSourceDataInputStream is wrapped around a regular Java input
> stream, and enables us to support incoming messages which do not conform to
> byte-boundaries.
> The problem is that there's no way for users to wrap an
> InputSourceDataInputStream around a TCP/IP socket, and have it behave
> properly when a read() call temporarily says 0 bytes available.
> Obviously we don't want to sit in a tight loop just retrying the read until
> we get either some bytes or end-of-data.
> The right API here is that if the read() of the underlying java stream
> returns 0 bytes, that a hook function supplied by the API user is called.
> One obvious thing a user can do is put a call to Thread.yield() in the hook.
> (That might even want to be the default behavior if they supply no hook.)
> Then if they have a separate thread parsing the data with daffodil, that
> thread will at least yield the CPU, i.e., behave politely in a multi-threaded
> world.
> More advanced usage could start a Daffodil parse using co-routines, returning
> control to the caller when the parse must pause due to read() of the Java
> input stream returning 0 bytes.
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)