[ https://issues.apache.org/jira/browse/JENA-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041184#comment-13041184 ]
Andy Seaborne commented on JENA-69:
-----------------------------------
I tried to apply patch JENA-69-r1128283.patch and came across a couple of
problematic areas:
1/ It changes the TSV format by adding an extra line at the start with the
number of variables.
This means the line containing the variable names is line 2 and the data starts
on line 3.
The TSV format is defined to have the column names on line 1. ARQ and 4Store
currently generate compatible formats.
See
http://www.iana.org/assignments/media-types/text/tab-separated-values
Is the count needed? Could the app read in the first line, parse out the
variables and use the number of variables as the count?
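
A minimal sketch of that idea, assuming the header is a single line of
tab-separated variable names that may carry a leading '?'. The class and
method names (TsvHeader, parseVariables) are hypothetical and not part of ARQ
or the patch:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: derives the variable list (and hence the count)
// from line 1 of a TSV result set, so no separate count line is needed.
final class TsvHeader {
    private TsvHeader() {}

    // Split the header line on tabs and return the variable names.
    // Assumption: a leading '?' on a name is optional and is stripped.
    static List<String> parseVariables(String headerLine) {
        List<String> vars = new ArrayList<>();
        if (headerLine == null || headerLine.isEmpty()) {
            return vars;
        }
        // limit -1 keeps trailing empty fields instead of dropping them
        for (String field : headerLine.split("\t", -1)) {
            String name = field.trim();
            if (name.startsWith("?")) {
                name = name.substring(1);
            }
            vars.add(name);
        }
        return vars;
    }
}
```

The variable count is then simply parseVariables(firstLine).size(), and the
data rows still start on line 2 as the format expects.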
2/ There is ResultSetStream for wrapping an iterator<Binding> of rows to get a
ResultSet.
I think it would be better for TSVInput to wrap its iterator of Bindings with
ResultSetStream, which only promises a single pass over the results and avoids
materializing the intermediate results. It would even be possible for TSVInput
to create an iterator and not fully materialize the results as it currently
does. This would help stream processing and scalability.
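
To illustrate the streaming approach, here is a rough, self-contained sketch of
a single-pass iterator that reads one TSV row per call instead of loading the
whole input. It deliberately uses plain String[] rows rather than ARQ's
Binding, and the class name TsvRowIterator is hypothetical. In TSVInput the
rows would be turned into Bindings and the iterator handed to ResultSetStream:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical single-pass reader: the header is consumed eagerly,
// each data row is read only when the caller asks for it.
final class TsvRowIterator implements Iterator<String[]> {
    private final BufferedReader reader;
    private final String[] variables;
    private String[] lookahead;      // row read ahead by hasNext(), or null
    private boolean finished = false;

    TsvRowIterator(BufferedReader reader) {
        this.reader = reader;
        try {
            String header = reader.readLine();
            this.variables = (header == null) ? new String[0] : header.split("\t", -1);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    String[] variables() {
        return variables.clone();
    }

    @Override
    public boolean hasNext() {
        if (lookahead != null) return true;
        if (finished) return false;
        try {
            String line = reader.readLine();
            if (line == null) {          // end of input: nothing was buffered
                finished = true;
                return false;
            }
            String[] fields = line.split("\t", -1);
            if (fields.length != variables.length) {
                throw new IllegalStateException("Expected " + variables.length
                        + " columns but found " + fields.length);
            }
            lookahead = fields;
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public String[] next() {
        if (!hasNext()) throw new NoSuchElementException();
        String[] row = lookahead;
        lookahead = null;                // at most one row is held in memory
        return row;
    }
}
```

Because only the look-ahead row is kept, memory use stays constant regardless
of how many rows the input contains, at the cost of allowing only one pass.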
> Provide TSVInput reader
> -----------------------
>
> Key: JENA-69
> URL: https://issues.apache.org/jira/browse/JENA-69
> Project: Jena
> Issue Type: New Feature
> Components: ARQ, TDB
> Reporter: Laurent Pellegrino
> Priority: Blocker
> Labels: arq, resultset, tdb, tsv, tsvinput
> Attachments: JENA-69-r1128283.patch,
> removenodeclib-r1128186-version1.patch, tsvinput-r1128173-version1.patch,
> tsvinput-r1128173-version2.patch
>
>
> As stated on the mailing list, it is possible to serialize a ResultSet by
> using the TSV format. However, it is not possible to deserialize it (there is
> no TSVInput implementation).