[ https://issues.apache.org/jira/browse/JENA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15857942#comment-15857942 ]
ASF subversion and git services commented on JENA-1285:
-------------------------------------------------------
Commit 3673d6adb961b4736bdda1971833826ba9c5daa7 in jena's branch
refs/heads/master from [~andy.seaborne]
[ https://git-wip-us.apache.org/repos/asf?p=jena.git;h=3673d6a ]
JENA-1285: Merge commit 'refs/pull/213/head' of github.com:apache/jena
This closes #213.
> Have one Tokenizer token for strings.
> -------------------------------------
>
> Key: JENA-1285
> URL: https://issues.apache.org/jira/browse/JENA-1285
> Project: Apache Jena
> Issue Type: Improvement
> Components: RIOT
> Reporter: Andy Seaborne
> Assignee: Andy Seaborne
> Priority: Minor
>
> The Tokenizer ({{TokenizerText}}) faithfully records what sort of string it
> has processed using different token types - STRING1, STRING2, LONG_STRING1,
> LONG_STRING2.
> Sometimes it matters (N-Triples), sometimes it doesn't (Turtle).
> [Turtle rule for strings|https://www.w3.org/TR/turtle/#grammar-production-String]
> [N-Triples rule for strings|https://www.w3.org/TR/n-triples/#grammar-production-STRING_LITERAL_QUOTE]
> Instead of 4 token types (5 if you include the existing STRING token), it is
> proposed to use one token type, STRING, and record the actual string type seen
> separately.
> This makes working with non-text formats simpler, where there are strings
> without the concept of quotes, as does any format that accepts any string form.
> The specific cases (e.g. N-Triples) can still test for the details of the
> string syntax seen but the token type is the conceptual "superclass" STRING
> type.
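
A minimal sketch of the proposal above, for illustration only. The class,
enum, and method names here (Token, StringType, StringChecks and so on) are
hypothetical and are not taken from the Jena code base or from the commit.
It also assumes that STRING2 denotes the double-quoted form, which is the
only form the N-Triples grammar accepts.

```java
// Hypothetical sketch: one STRING token type, with the quoting style kept separately.
// None of these names are claimed to match Jena's actual classes.
enum TokenType { STRING, IRI, KEYWORD }

// The string syntax actually seen by a text tokenizer.
// Assumption: STRING2 is the double-quoted form ("...").
enum StringType { STRING1, STRING2, LONG_STRING1, LONG_STRING2 }

final class Token {
    private final TokenType type;
    private final String image;
    // null when the token is not a string, or when the source format
    // has no concept of quotes (e.g. a non-text format).
    private final StringType stringType;

    Token(TokenType type, String image, StringType stringType) {
        this.type = type;
        this.image = image;
        this.stringType = stringType;
    }

    TokenType getType() { return type; }
    String getImage() { return image; }
    StringType getStringType() { return stringType; }

    boolean isString() { return type == TokenType.STRING; }
}

final class StringChecks {
    // A Turtle-style consumer accepts any string form, so it only
    // needs to look at the token type.
    static boolean acceptTurtle(Token t) {
        return t.isString();
    }

    // An N-Triples-style consumer still inspects the recorded syntax
    // and accepts only the double-quoted form.
    static boolean acceptNTriples(Token t) {
        return t.isString() && t.getStringType() == StringType.STRING2;
    }
}
```

With this shape, a token built as new Token(TokenType.STRING, "abc",
StringType.LONG_STRING1) passes the Turtle-style check and fails the
N-Triples-style one, while a token from a non-text format can carry a null
string type and still be treated as a STRING.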