[ 
https://issues.apache.org/jira/browse/JENA-550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779939#comment-13779939
 ] 

Leigh Dodds commented on JENA-550:
----------------------------------

My assumption was that creating a TDB index with tdbloader2 in the same 
directory would replace any existing index there. So detecting the abrupt exit 
might not be the issue; maybe that version of the loader should be recreating 
any existing index files?

From what you're saying, if a load breaks I need to manually delete any 
existing files, as the indexer will re-use what's there even if it's broken?
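
If that is the behaviour, one workaround would be to have the calling script clear the database directory itself before each load. Below is a minimal sketch of that idea, not anything shipped with Jena: the helper name clean_load and the paths in the usage comment are hypothetical, and the loader command is passed in by the caller rather than assumed.

```python
import shutil
import subprocess
from pathlib import Path


def clean_load(db_dir, loader_cmd):
    """Delete db_dir if present, recreate it empty, then run loader_cmd.

    Raises subprocess.CalledProcessError if the loader exits non-zero, so a
    calling script stops before any later step (such as text indexing) runs
    against a partially built database.
    """
    db_path = Path(db_dir)
    if db_path.exists():
        # Drop files left behind by an earlier, possibly broken, load.
        shutil.rmtree(db_path)
    db_path.mkdir(parents=True)
    subprocess.run(loader_cmd, check=True)


# Example call (hypothetical paths; --loc names the TDB database directory):
# clean_load("/data/tdb", ["tdbloader2", "--loc", "/data/tdb", "/data/input.nt"])
```

Because the loader is run with check=True, a failed load raises an exception instead of letting the script carry on to the text indexer.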


                
> "Impossibly Large Object" exception with command-line indexing
> --------------------------------------------------------------
>
>                 Key: JENA-550
>                 URL: https://issues.apache.org/jira/browse/JENA-550
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: TDB
>            Reporter: Leigh Dodds
>            Priority: Minor
>
> I have a script that calls tdbloader2 to create TDB indexes then the new 
> Lucene text indexer to create indexes.
> The first step completed successfully and then whilst the text indexer was 
> running I got the following stack trace:
> ERROR 
> ObjectFileStorage.read[nodes](21694280)[filesize=32753969][file.size()=32753969]:
>  Impossibly large object : 1668246831 bytes > 
> filesize-(loc+SizeOfInt)=11059685
> com.hp.hpl.jena.tdb.base.file.FileException: 
> ObjectFileStorage.read[nodes](21694280)[filesize=32753969][file.size()=32753969]:
>  Impossibly large object : 1668246831 bytes > 
> filesize-(loc+SizeOfInt)=11059685
>       at 
> com.hp.hpl.jena.tdb.base.objectfile.ObjectFileStorage.read(ObjectFileStorage.java:346)
>       at com.hp.hpl.jena.tdb.lib.NodeLib.fetchDecode(NodeLib.java:78)
>       at 
> com.hp.hpl.jena.tdb.nodetable.NodeTableNative.readNodeFromTable(NodeTableNative.java:178)
>       at 
> com.hp.hpl.jena.tdb.nodetable.NodeTableNative._retrieveNodeByNodeId(NodeTableNative.java:103)
>       at 
> com.hp.hpl.jena.tdb.nodetable.NodeTableNative.getNodeForNodeId(NodeTableNative.java:74)
>       at 
> com.hp.hpl.jena.tdb.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:103)
>       at 
> com.hp.hpl.jena.tdb.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:74)
>       at 
> com.hp.hpl.jena.tdb.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:55)
>       at 
> com.hp.hpl.jena.tdb.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:67)
>       at com.hp.hpl.jena.tdb.lib.TupleLib.quad(TupleLib.java:161)
>       at com.hp.hpl.jena.tdb.lib.TupleLib.quad(TupleLib.java:153)
>       at com.hp.hpl.jena.tdb.lib.TupleLib.access$100(TupleLib.java:45)
>       at com.hp.hpl.jena.tdb.lib.TupleLib$4.convert(TupleLib.java:87)
>       at com.hp.hpl.jena.tdb.lib.TupleLib$4.convert(TupleLib.java:83)
>       at org.apache.jena.atlas.iterator.Iter$4.next(Iter.java:317)
>       at 
> org.apache.jena.atlas.iterator.IteratorCons.next(IteratorCons.java:97)
>       at jena.textindexer.exec(textindexer.java:125)
>       at arq.cmdline.CmdMain.mainMethod(CmdMain.java:101)
>       at arq.cmdline.CmdMain.mainRun(CmdMain.java:63)
>       at arq.cmdline.CmdMain.mainRun(CmdMain.java:50)
>       at jena.textindexer.main(textindexer.java:55)
> No other code is touching the database, so I'm not clear how the node table 
> could have gotten corrupted. 
> During a previous run of the script I got an exception because of an invalid 
> URI:
> org.apache.jena.riot.RiotException: [line: 2, col: 110] illegal escape 
> sequence value: , (0x2C)
>       at 
> org.apache.jena.riot.system.ErrorHandlerFactory$ErrorHandlerStd.fatal(ErrorHandlerFactory.java:132)
> I'm wondering whether this exception might have been the cause of the 
> corruption?
> Deleting the index directories and re-running fixed the issue.

