     [ https://issues.apache.org/jira/browse/JENA-2083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Seaborne resolved JENA-2083.
---------------------------------
    Resolution: Incomplete

> Support skipping/ignoring errors with tdbloader
> -----------------------------------------------
>
>                 Key: JENA-2083
>                 URL: https://issues.apache.org/jira/browse/JENA-2083
>             Project: Apache Jena
>          Issue Type: New Feature
>          Components: TDB, TDB2
>            Reporter: Timothy Higinbottom
>            Priority: Major
>
> Hi all,
> I have a fairly large number (~22,000) of N-Triples files that I hope to 
> import into TDB2 and query with Fuseki.
> I boosted the RAM allotted to the JVM and used the parallel mode of 
> tdb2.tdbloader. This whizzed through the first 1,000 files.
> However, some of the files are incorrectly serialized, so they caused errors 
> when Jena tried to read them. It is not feasible right now to sort out the 
> defective files from the good ones before running tdbloader.
> It would be great if tdbloader had an option to skip files that cause 
> errors so that it can continue processing the other files.
> The main reason this should be part of tdbloader itself is that the 
> alternative (running xargs or a loop in Bash) decreases performance because 
> then the loading is effectively synchronous and the user can't take advantage 
> of the tdbloader modes and batching.
> Thanks for this great project!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
