[jira] [Commented] (JENA-1584) Include a Javacc based Turtle parser in RIOT

ASF GitHub Bot (JIRA) Sun, 05 Aug 2018 06:30:30 -0700


    [ https://issues.apache.org/jira/browse/JENA-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16569464#comment-16569464 ]


ASF GitHub Bot commented on JENA-1584:
--------------------------------------

Github user kinow commented on a diff in the pull request:

    https://github.com/apache/jena/pull/455#discussion_r207735233
  
    --- Diff: jena-arq/src/main/java/org/apache/jena/riot/lang/extra/javacc/ParseException.java ---
    @@ -0,0 +1,205 @@
    +/* Generated By:JavaCC: Do not edit this line. ParseException.java Version 6.0 */
    +/* JavaCCOptions:KEEP_LINE_COL=null */
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.jena.riot.lang.extra.javacc;
    +
    +/**
    + * This exception is thrown when parse errors are encountered.
    + * You can explicitly create objects of this exception type by
    + * calling the method generateParseException in the generated
    + * parser.
    + *
    + * You can modify this class to customize your error reporting
    + * mechanisms so long as you retain the public fields.
    + */
    +public class ParseException extends Exception {
    +
    +  /**
    +   * The version identifier for this Serializable class.
    +   * Increment only if the <i>serialized</i> form of the
    +   * class changes.
    +   */
    +  private static final long serialVersionUID = 1L;
    +
    +  /**
    +   * This constructor is used by the method "generateParseException"
    +   * in the generated parser.  Calling this constructor generates
    +   * a new object of this type with the fields "currentToken",
    +   * "expectedTokenSequences", and "tokenImage" set.
    +   */
    +  public ParseException(Token currentTokenVal,
    +                        int[][] expectedTokenSequencesVal,
    +                        String[] tokenImageVal
    +                       )
    +  {
    +    super(initialise(currentTokenVal, expectedTokenSequencesVal, tokenImageVal));
    +    currentToken = currentTokenVal;
    +    expectedTokenSequences = expectedTokenSequencesVal;
    +    tokenImage = tokenImageVal;
    +  }
    +
    +  /**
    +   * The following constructors are for use by you for whatever
    +   * purpose you can think of.  Constructing the exception in this
    +   * manner makes the exception behave in the normal way - i.e., as
    +   * documented in the class "Throwable".  The fields "errorToken",
    +   * "expectedTokenSequences", and "tokenImage" do not contain
    +   * relevant information.  The JavaCC generated code does not use
    +   * these constructors.
    +   */
    +
    +  public ParseException() {
    +    super();
    +  }
    +
    +  /** Constructor with message. */
    +  public ParseException(String message) {
    +    super(message);
    +  }
    +
    +
    +  /**
    +   * This is the last token that has been consumed successfully.  If
    +   * this object has been created due to a parse error, the token
    +   * followng this token will (therefore) be the first error token.
    +   */
    +  public Token currentToken;
    +
    +  /**
    +   * Each entry in this array is an array of integers.  Each array
    +   * of integers represents a sequence of tokens (by their ordinal
    +   * values) that is expected at this point of the parse.
    +   */
    +  public int[][] expectedTokenSequences;
    +
    +  /**
    +   * This is a reference to the "tokenImage" array of the generated
    +   * parser within which the parse error occurred.  This array is
    +   * defined in the generated ...Constants interface.
    +   */
    +  public String[] tokenImage;
    +
    +  /**
    +   * It uses "currentToken" and "expectedTokenSequences" to generate a parse
    +   * error message and returns it.  If this object has been created
    +   * due to a parse error, and you do not catch it (it gets thrown
    +   * from the parser) the correct error message
    +   * gets displayed.
    +   */
    +  private static String initialise(Token currentToken,
    +                           int[][] expectedTokenSequences,
    +                           String[] tokenImage) {
    +    String eol = System.getProperty("line.separator", "\n");
    +    StringBuffer expected = new StringBuffer();
    +    int maxSize = 0;
    +    for (int i = 0; i < expectedTokenSequences.length; i++) {
    +      if (maxSize < expectedTokenSequences[i].length) {
    +        maxSize = expectedTokenSequences[i].length;
    +      }
    +      for (int j = 0; j < expectedTokenSequences[i].length; j++) {
    +        expected.append(tokenImage[expectedTokenSequences[i][j]]).append(' ');
    +      }
    +      if (expectedTokenSequences[i][expectedTokenSequences[i].length - 1] != 0) {
    +        expected.append("...");
    +      }
    +      expected.append(eol).append("    ");
    +    }
    +    String retval = "Encountered \"";
    +    Token tok = currentToken.next;
    +    for (int i = 0; i < maxSize; i++) {
    +      if (i != 0) retval += " ";
    +      if (tok.kind == 0) {
    +        retval += tokenImage[0];
    +        break;
    +      }
    +      retval += " " + tokenImage[tok.kind];
    +      retval += " \"";
    +      retval += add_escapes(tok.image);
    +      retval += " \"";
    +      tok = tok.next;
    +    }
    +    retval += "\" at line " + currentToken.next.beginLine + ", column " + currentToken.next.beginColumn;
    +    retval += "." + eol;
    +    if (expectedTokenSequences.length == 1) {
    +      retval += "Was expecting:" + eol + "    ";
    +    } else {
    +      retval += "Was expecting one of:" + eol + "    ";
    +    }
    +    retval += expected.toString();
    +    return retval;
    +  }
    +
    +  /**
    +   * The end of line string for this machine.
    +   */
    +  protected String eol = System.getProperty("line.separator", "\n");
    +
    +  /**
    +   * Used to convert raw characters to their escaped version
    +   * when these raw version cannot be used as part of an ASCII
    +   * string literal.
    +   */
    +  static String add_escapes(String str) {
    +      StringBuffer retval = new StringBuffer();
    --- End diff --
    
    See comment above about `StringBuffer` & `StringBuilder`...
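
The earlier comment is not included in this message. Assuming it is the usual suggestion to prefer `StringBuilder` over `StringBuffer` for a method-local buffer that is never shared between threads, the change might look roughly like the sketch below. The class name `EscapeSketch`, the method name `addEscapes`, and the whole method body are hypothetical: the diff above stops after the first statement of `add_escapes`, so the body here only approximates what such an escaping routine typically does.

```java
public class EscapeSketch {

    // Hypothetical stand-in for add_escapes; the body is an assumption,
    // since the quoted diff is cut off after its first statement.
    static String addEscapes(String str) {
        // StringBuilder offers the same append/toString calls as StringBuffer
        // but without synchronization, which a method-local buffer does not need.
        StringBuilder retval = new StringBuilder();
        for (int i = 0; i < str.length(); i++) {
            char ch = str.charAt(i);
            switch (ch) {
                case '\b': retval.append("\\b"); break;
                case '\t': retval.append("\\t"); break;
                case '\n': retval.append("\\n"); break;
                case '\f': retval.append("\\f"); break;
                case '\r': retval.append("\\r"); break;
                case '"':  retval.append("\\\""); break;
                case '\'': retval.append("\\'"); break;
                case '\\': retval.append("\\\\"); break;
                default:
                    if (ch < 0x20 || ch > 0x7e) {
                        // Outside printable ASCII: emit a four-digit hex escape.
                        retval.append(String.format("\\u%04x", (int) ch));
                    } else {
                        retval.append(ch);
                    }
            }
        }
        return retval.toString();
    }

    public static void main(String[] args) {
        // Prints the escaped form of a string containing a tab and quotes.
        System.out.println(addEscapes("a\tb \"q\""));
    }
}
```

Since `ParseException.java` is JavaCC-generated, a hand edit like this would presumably need to be reapplied whenever the parser is regenerated.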


> Include a Javacc based Turtle parser in RIOT
> --------------------------------------------
>
>                 Key: JENA-1584
>                 URL: https://issues.apache.org/jira/browse/JENA-1584
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: RIOT
>    Affects Versions: Jena 3.8.0
>            Reporter: Andy Seaborne
>            Assignee: Andy Seaborne
>            Priority: Minor
>             Fix For: Jena 3.9.0
>
>
> Turtle is the basis for some additional languages (RDF*, SHACL and ShEx 
> compact forms).
> The main RIOT Turtle parser is written for speed, with a tuned tokenizer 
> and a directly written Java grammar parser. This makes it harder to reuse and 
> extend.
> This ticket proposes including another RDF 1.1 compliant Turtle parser based 
> on JavaCC to provide an easier route for additional languages by providing 
> all the details of Turtle, such as the tokens and prefix name handling, in a 
> form more suitable as a base for a new language. (Reuse will still be by 
> copying the parser system, not by class inheritance.)
> RDF 1.1 Turtle and SPARQL 1.1 were aligned by the working groups and share 
> tokens and several grammar rules.
> This would not be active by default (i.e. not a registered {{Lang}} and its 
> parser factory but registered by automatic initialization). Its test suite 
> would be run in the build and pass the RDF 1.1 Turtle test suite.
>  
> There is a non-RDF 1.1 JavaCC Turtle parser in jena-core that is based on the 
> pre-RDF 1.1 state of Turtle. It is sufficient for the assembler tests that 
> read turtle files. It could be moved into the test area except there appear 
> to be some legacy applications that only use jena-core.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
