[ 
https://issues.apache.org/jira/browse/JENA-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Seaborne resolved JENA-1325.
---------------------------------
    Resolution: Won't Fix
      Assignee: Andy Seaborne

> RIOT parse many files at once, output only valid ones
> -----------------------------------------------------
>
>                 Key: JENA-1325
>                 URL: https://issues.apache.org/jira/browse/JENA-1325
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: RIOT
>         Environment: GNU/Linux
>            Reporter: Laura
>            Assignee: Andy Seaborne
>
> This issue is more or less related to this other one 
> https://issues.apache.org/jira/browse/JENA-1322
> I have a folder with thousands of files, mostly small RDF/XML files. I'm 
> using RIOT to validate them and dump the valid ones into ntriples files. The 
> problem is that calling RIOT on each file is not going to cut it. The 
> overhead is significant enough that this operation is just too slow (hours). 
> So I've tried to call RIOT only once on all files together using
> {noformat}
>     riot \
>         --verbose \
>         --stop \
>         --check \
>         --strict \
>         --output=nt \
>         files/*.rdf > files.nt
> {noformat}
> and in this way validation is much faster. The problem is, that it's still 
> dumping invalid files to the .nt output file. I'm downloading these files 
> from the Internet, so I'm not going to fix them myself, I just want to skip 
> bad files.
> Now, to be clear, I understand that RIOT is of course not meant to fix bad 
> data, and I'm not asking for this. I'm suggesting however to add an 
> *--option* such that RIOT can do the following:
> 1. parse multiple files at once (so that there is no need to invoke the same 
> RIOT command for each file)
> 2. for every file, check/validate it
> 3. if *--output* is set, only output those files or triples that didn't raise 
> any ERROR
> I think this is well in the scope of RIOT functionalities. Could this option 
> please be added to RIOT?
> Thank you.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to