afs commented on PR #2408:
URL: https://github.com/apache/jena/pull/2408#issuecomment-2320802444

   A `TX`-`TC` pair would give the framing mentioned by the Thrift developers. 
   Comment: https://lists.apache.org/thread/yd8wnvrg7ltg6szl09kk7fpoglxtvkv1
   
   If there is any fault in the stream, then `TC` isn't seen and the processor 
should abort. This is robust to any violation of the expected input. It also 
detects data after the `TC`, e.g. concatenated input that is legal syntax but 
an unexpected condition. There are other conditions, such as data before `TX` 
or no `TX` at all ("TX. junk" has a different error message), as is the case 
for non-patch input.
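   
   As a rough illustration of the framing conditions above, here is a minimal 
sketch in plain Java. It is not Jena or RDF Delta code: the class 
`PatchFramingSketch`, the `Framing` enum and the `check` method are 
hypothetical names. It assumes the text form of a patch with one row per line, 
the operation code as the first token, a single `TX`-`TC` frame, and that 
header rows (code `H`) may precede `TX`.

```java
import java.util.List;

// Hypothetical helper for illustration only; not part of Jena or RDF Delta.
class PatchFramingSketch {

    enum Framing { COMPLETE, NO_TX, DATA_BEFORE_TX, MISSING_TC, DATA_AFTER_TC }

    // Classify the TX/TC framing of a patch given as text rows.
    // Assumption: the first whitespace-separated token of a row is its code.
    static Framing check(List<String> lines) {
        boolean seenTX = false;
        boolean seenTC = false;
        boolean dataBeforeTX = false;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty())
                continue;
            // Anything after the commit is unexpected, even if legal syntax.
            if (seenTC)
                return Framing.DATA_AFTER_TC;
            String code = line.split("\\s+")[0];
            if (!seenTX) {
                if (code.equals("TX")) {
                    if (dataBeforeTX)
                        return Framing.DATA_BEFORE_TX;
                    seenTX = true;
                } else if (!code.equals("H")) {
                    // Assumption: only header rows are allowed before TX.
                    dataBeforeTX = true;
                }
            } else if (code.equals("TC")) {
                seenTC = true;
            }
        }
        if (!seenTX)
            return Framing.NO_TX;       // e.g. non-patch input
        // A truncated or faulty stream never reaches TC.
        return seenTC ? Framing.COMPLETE : Framing.MISSING_TC;
    }

    public static void main(String[] args) {
        List<String> truncated = List.of("TX .", "A <s> <p> <o> .");
        System.out.println(check(truncated));   // prints MISSING_TC
    }
}
```

   A caller would treat anything other than `COMPLETE` as a reason to abort.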
   
   Non-patch input leads to an empty patch. One possibility is to add 
`RDFPatch.isEmpty`, which would cover the case of no `TX`.
   
   Inspecting the stack causes the exception to materialize the stack.
   
   There are other sources of information - if the transport is from a string, 
it's not truncated. A string is either good or bad.
   An input stream from a network can be truncated.
   
   The Protobuf form may be more robust. It is slightly slower than Thrift: 
Thrift parses RDF at up to 1 million triples/s and Protobuf is about 5% slower 
because it adds an object wrapper around each unit in the input stream, which 
may give better error visibility.
   This one extra Java object looks like the cause of the 5%; we're into the 
sub-microsecond area per triple here, so that is quite possible.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

