On 30/07/16 15:43, Martynas Jusevičius wrote:
Andy,

why does ErrorHandlerStd throw RiotException?
https://github.com/apache/jena/blob/master/jena-arq/src/main/java/org/apache/jena/riot/system/ErrorHandlerFactory.java#L128

It fails to pass on URI violations that way, because RiotException is
too general. I guess this is the outcome of the ErrorHandler interface
methods.

Yes.  ErrorHandler is general.

Turns out there is already IRIException that captures the violations
(but unfortunately does not extend RiotException):

That is partly history, but it also avoids tying independent subsystems together into a big dependency tangle.

https://github.com/apache/jena/blob/master/jena-iri/src/main/java/org/apache/jena/iri/IRIException.java#L43

Couldn't ErrorHandler (re)use the IRIException? Maybe the interface
should be refactored based on logging (Riot)Exceptions instead of
String messages:

The parsers call the error handler when they find a structural parse error - that's what gives the line and column number and also means the error message is hand-crafted to the specific error. Some are warnings - the parsing continues.

The role of CheckerIRI.iriViolations is to convert IRI problems into the ErrorHandler framework. It can't provide line/column as it's too late - the ParserProfile can do so and can call an ErrorHandler.

For more precise control, that is probably the way to go for Jena 3.1.0 (3.0.1). RDF/XML is different because the parser is not RIOT based - it is Jena's original ARP parser in an adapter. The adapter does extract line and column information when it can. Maybe being liberal with IRI creation, and checking in a specific NodeChecker for the RDFStream driven checking process, might be better for you.

This is a general framework for the parsing process - there is only so much the normal components provided for the framework can do. Eventually, precise control needs components to capture the specific requirements.

org.apache.jena.riot.system.IRIResolver does throw RiotExceptions.

Adding ErrorHandler.errorIRI(IRIViolation, line, col) and ErrorHandler.warningIRI(IRIViolation, line, col) is possible, but needs investigation - I need to go back through the parsers (just ParserProfileBase and LangRDFXML ??) to make sure it is actually going to work as required. The parsing lifecycle may not fit it.

        Andy


  public class IRIException extends RiotException
  {

    public final Violation violation;

    public IRIException(Violation violation)
    {
      this.violation = violation;
    }

    public Violation getViolation()
    {
      return violation;
    }

  }

  public interface ErrorHandler
  {

    public void exception(RiotException rex, int level, long line, long col);

  }

  public class CheckerIRI implements NodeChecker
  {

    public static void iriViolations(IRI iri, ErrorHandler errorHandler, ...)
    {
      ...

      Violation v = iter.next();
      ...
      if ( isError )
      // IRI errors are warning at the level of parsing - they got through syntax checks.
      errorHandler.exception(new IRIException(v), Level.WARNING, line, col);
      ...
    }

  }

After this, IRI violations could be analyzed by ErrorHandler:

public class ErrorHandlerCustom implements ErrorHandler
{

  public void exception(RiotException rex, int level, long line, long col)
  {
    if (rex instanceof IRIException)
    {
      IRIException iex = (IRIException)rex;
      Violation v = iex.getViolation();
      ...
      if (level == Level.ERROR) throw rex; // throw exception like error() does now
    }
    ...
  }

}
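
The proposal above is pseudocode (it has elisions and mixes an int level with Level constants), so here is a minimal runnable sketch of the same idea, kept independent of Jena. Every type in it is a hypothetical stand-in: SketchRiotException and SketchIriException stand in for the exception classes, Severity for the level, violationCode for a Violation object, and EscalatingHandler shows just one possible policy (treat every IRI problem as fatal, even when it is reported as a warning). It is not the Jena API.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a general parser exception (hypothetical, not a Jena class).
class SketchRiotException extends RuntimeException {
    SketchRiotException(String message) { super(message); }
}

// Carries structured information about the IRI problem, not only a message.
final class SketchIriException extends SketchRiotException {
    final String violationCode; // hypothetical stand-in for a Violation object

    SketchIriException(String violationCode, String message) {
        super(message);
        this.violationCode = violationCode;
    }
}

enum Severity { WARNING, ERROR }

// Stand-in for a handler interface that receives exceptions, not strings.
interface SketchErrorHandler {
    void exception(SketchRiotException ex, Severity severity, long line, long col);
}

// One possible policy: IRI problems always abort, other warnings are recorded.
final class EscalatingHandler implements SketchErrorHandler {
    final List<String> warnings = new ArrayList<>();

    @Override
    public void exception(SketchRiotException ex, Severity severity, long line, long col) {
        if (ex instanceof SketchIriException) {
            // The handler can inspect the violation before deciding what to do.
            throw ex;
        }
        if (severity == Severity.ERROR) {
            throw ex;
        }
        warnings.add("[" + line + "," + col + "] " + ex.getMessage());
    }
}
```

With this shape the caller decides per exception type whether parsing continues, which a String-only message does not allow.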

On Fri, Jul 8, 2016 at 11:28 PM, Andy Seaborne <a...@apache.org> wrote:
On 08/07/16 18:57, A. Soroka wrote:

This may or may not be close to what you are looking for, but you might
try something like oaj.riot.RDFDataMgr::parse with a wrapper around
oaj.riot.system.StreamRDFLib::graph. You can subclass
oaj.riot.system.StreamRDFWrapper for that. Then you have tuple-level control
over the process and can throw exceptions or execute side-effects as
desired.
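
To illustrate the wrapper idea, here is a minimal sketch of the pattern that runs without Jena on the classpath. All types are hypothetical stand-ins: TripleSink plays the role of the stream interface, CollectingSink the role of the graph-backed sink, and StrictIriSink the role of a StreamRDFWrapper subclass. java.net.URI is used only as a rough substitute for real IRI checking, and every term is assumed to be an IRI (no literals or blank nodes).

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;

// Stand-in for a parsed triple; terms are plain strings for brevity.
final class SketchTriple {
    final String subject, predicate, object;

    SketchTriple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

// Stand-in for the sink interface that a parser pushes triples into.
interface TripleSink {
    void triple(SketchTriple t);
}

// Stand-in for a graph-backed sink: it only collects what it receives.
final class CollectingSink implements TripleSink {
    final List<SketchTriple> triples = new ArrayList<>();

    @Override
    public void triple(SketchTriple t) { triples.add(t); }
}

// The wrapper: checks each term, then forwards the triple to the wrapped sink.
final class StrictIriSink implements TripleSink {
    private final TripleSink delegate;

    StrictIriSink(TripleSink delegate) { this.delegate = delegate; }

    @Override
    public void triple(SketchTriple t) {
        check(t.subject);
        check(t.predicate);
        check(t.object);
        delegate.triple(t); // reached only if all three terms passed
    }

    private static void check(String term) {
        try {
            if (!new URI(term).isAbsolute()) {
                throw new IllegalArgumentException("Relative IRI: " + term);
            }
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Malformed IRI: " + term, e);
        }
    }
}
```

Because the check runs before the triple reaches the wrapped sink, a broken IRI stops the load at the first bad triple and that triple never reaches the graph.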

---
A. Soroka
The University of Virginia Library

On Jul 8, 2016, at 1:46 PM, Martynas Jusevičius <marty...@graphity.org>
wrote:

Hey,

I have implemented an RDF/POST parser which extends ReaderRIOTBase:

https://github.com/AtomGraph/Core/blob/master/src/main/java/org/graphity/core/riot/lang/RDFPostReader.java

It silently accepts broken URIs, so I want the behavior to depend on
ParserProfile, and throw exceptions in case of strict error handler.

I'm reading Model using Model.read(InputStream in, String base, String
lang), so the question is: how do I set the (strict) error handler
before reading it?


Model.read does not expose the error handler.

See ExRIOT_2 which sets the ErrorHandler.

https://github.com/apache/jena/blob/master/jena-arq/src-examples/arq/examples/riot/ExRIOT_2.java

Create a ReaderRIOT, set the error handler.

See also:
ErrorHandlerFactory.setDefaultErrorHandler

Or, for specifically IRIs, you might want to use ParserProfileBase or a
derivative rather than ParserProfileChecker



Also, what is the difference/connection between ErrorHandler and
RDFErrorHandler?

https://jena.apache.org/documentation/javadoc/arq/org/apache/jena/riot/system/ErrorHandler.html

https://jena.apache.org/documentation/javadoc/jena/org/apache/jena/rdf/model/RDFErrorHandler.html


RDFErrorHandler is used by ARP, for RDF/XML parsing.

It is adapted to ErrorHandler by RIOT.


Thanks,

Martynas
atomgraph.com



