[ 
https://issues.apache.org/jira/browse/JENA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15848488#comment-15848488
 ] 

ASF GitHub Bot commented on JENA-1285:
--------------------------------------

Github user ajs6f commented on a diff in the pull request:

    https://github.com/apache/jena/pull/213#discussion_r98915106
  
    --- Diff: jena-arq/src/main/java/org/apache/jena/riot/tokens/StringType.java ---
    @@ -0,0 +1,22 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.jena.riot.tokens;
    +
    +/** Seen form of a {@link TokenType#STRING} */
    +public enum StringType { STRING1, STRING2, LONG_STRING1, LONG_STRING2 }
    --- End diff --
    
    It would be nice to have a quick comment explaining what the actual 
differences among these four values are.
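
As an illustration of the kind of comment being asked for, here is a minimal
sketch. It is not the code from the pull request: the name StringTypeSketch,
the delimiter() and isLong() helpers, and the mapping of each constant to a
quoting form are assumptions, based on the names mirroring the
STRING_LITERAL1/2 and STRING_LITERAL_LONG1/2 productions of the SPARQL
grammar.

```java
/**
 * Hypothetical sketch of a documented string-form enum.
 * The mapping of constants to quote styles is an assumption, not taken
 * from the pull request.
 */
enum StringTypeSketch {
    /** Single-quoted string: 'abc' */
    STRING1("'"),
    /** Double-quoted string: "abc" */
    STRING2("\""),
    /** Long string in triple single quotes: '''abc''' */
    LONG_STRING1("'''"),
    /** Long string in triple double quotes: """abc""" */
    LONG_STRING2("\"\"\"");

    private final String delimiter;

    StringTypeSketch(String delimiter) {
        this.delimiter = delimiter;
    }

    /** The opening (and closing) delimiter for this string form. */
    String delimiter() {
        return delimiter;
    }

    /** True for the triple-quoted forms. */
    boolean isLong() {
        return delimiter.length() == 3;
    }
}
```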


> Have one Tokenizer token for strings.
> -------------------------------------
>
>                 Key: JENA-1285
>                 URL: https://issues.apache.org/jira/browse/JENA-1285
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: RIOT
>            Reporter: Andy Seaborne
>            Assignee: Andy Seaborne
>            Priority: Minor
>
> The Tokenizer ({{TokenizerText}}) faithfully records what sort of string it 
> has processed using different token types - STRING1, STRING2, LONG_STRING1, 
> LONG_STRING2.
> Sometimes it matters (N-Triples), sometimes it doesn't (Turtle).
> [Turtle rule for strings|https://www.w3.org/TR/turtle/#grammar-production-String]
> [N-Triples rule for strings|https://www.w3.org/TR/n-triples/#grammar-production-STRING_LITERAL_QUOTE]
> Instead of 4 token types (5 if you include the existing STRING token), it is 
> proposed to use one token type, STRING, and record the actual string type seen 
> separately.
> This makes working with non-text formats simpler, where there are strings 
> without the concept of quotes, as well as any format that accepts any string form.
> The specific cases (e.g. N-Triples) can still test for the details of the 
> string syntax seen but the token type is the conceptual "superclass" STRING 
> type.
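
The proposal above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not Jena's actual Token API: the names
SketchTokenType, SketchStringType, SketchToken and SketchChecks are all
hypothetical, as is the use of a null string form for tokens that come from a
non-text format with no notion of quotes.

```java
import java.util.Objects;

/** Hypothetical token kinds; only STRING matters for this sketch. */
enum SketchTokenType { STRING, IRI, KEYWORD }

/** Mirrors the four string forms named in the issue. */
enum SketchStringType { STRING1, STRING2, LONG_STRING1, LONG_STRING2 }

/** A token whose type is always STRING, with the seen form kept separately. */
final class SketchToken {
    private final SketchTokenType type;
    private final String image;
    private final SketchStringType stringType; // may be null (assumption: non-text source)

    private SketchToken(SketchTokenType type, String image, SketchStringType stringType) {
        this.type = type;
        this.image = image;
        this.stringType = stringType;
    }

    /** Create a STRING token; pass null as the form when quotes do not apply. */
    static SketchToken string(String image, SketchStringType stringType) {
        return new SketchToken(SketchTokenType.STRING, Objects.requireNonNull(image), stringType);
    }

    SketchTokenType type() { return type; }
    String image() { return image; }
    SketchStringType stringType() { return stringType; }
}

final class SketchChecks {
    private SketchChecks() {}

    /** Turtle accepts every string form, so only the token type is tested. */
    static boolean okForTurtle(SketchToken t) {
        return t.type() == SketchTokenType.STRING;
    }

    /**
     * N-Triples allows only the double-quoted form, so the recorded string
     * form is tested as well (assumes STRING2 is the double-quoted form).
     */
    static boolean okForNTriples(SketchToken t) {
        return t.type() == SketchTokenType.STRING
                && t.stringType() == SketchStringType.STRING2;
    }
}
```

With this shape, a consumer that does not care about quoting tests a single
token type, while a strict consumer such as an N-Triples parser can still
inspect the recorded form.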



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
