Hi Andy,

Thanks for the insight! It's really helpful. I'll look into tdbloader.

Yuhan

On Thu, Aug 30, 2012 at 6:04 AM, Andy Seaborne <[email protected]> wrote:

> On 30/08/12 03:01, Yuhan Zhang wrote:
>
>> actually, the file I used was 260MB... I tried something smaller than 1MB
>> and it worked.
>>
>> seems like the s-put ruby script is stream-friendly.. do I have to break
>> large files into parts?
>> what's the recommended way to load large files?
>
> The Fuseki server is defensive - it reads the body of the PUT into a
> temporary in-memory model to make sure it's going to be able to parse
> everything, then, when the end of the body is reached, it adds the graph to
> the model.
>
> All the web operations are defensive and check their inputs so that a bad
> request does not lead to half a request being serviced.
>
> TDB Transactions could overcome that to some extent but they are not fully
> scalable and Fuseki does not (currently) know that TDB transactions are
> fully ACID, so the transaction itself can be used to catch bad data and not
> mess up the database.
>
> A way to bulk load data is to load the database offline using the
> bulkloader (tdbloader). The bulkloader knows how to cheat - it manipulates
> the internal tables directly.
>
> So either split the file up, or bulk load the database offline.
>
> If split, use DELETE or PUT empty first then POST, POST, POST to append data.
>
> Andy
>
>> Thank you.
>>
>> Yuhan
>>
>> On Wed, Aug 29, 2012 at 6:13 PM, Yuhan Zhang <[email protected]> wrote:
>>
>>> Hi all,
>>>
>>> I'm experimenting with fuseki, and reached some trouble at loading data.
>>>
>>> I followed the getting started tutorial successfully with the books.ttl
>>> file.
>>> but when feeding a small .ttl file (1.2MB) from dbpedia, the script
>>> is causing system halt:
>>> http://downloads.dbpedia.org/3.8/bg/geo_coordinates_bg.ttl.bz2
>>>
>>> The server was started with the following setting:
>>>     fuseki-server --update --loc=/tmp/ds /ds
>>> The data was loaded in this way:
>>>     ruby s-put http://localhost:3030/ds/datadefault geo_coordinates_bg.ttl
>>>
>>> I'm using the latest distribution: jena-fuseki-0.2.4
>>>
>>> Thank you.
>>>
>>> Yuhan

-- 
Yuhan Zhang
Senior Software Engineer
OneScreen Inc.
[email protected]
www.onescreen.com
(949) 525-4825 Ext: 177
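
For reference, the "split the file, PUT empty, then POST, POST, POST" route from Andy's reply might look roughly like the sketch below. It is only an illustration under several assumptions that are not from the thread: the file holds one complete triple per line with no @prefix directives or multi-line statements (otherwise cutting on line boundaries produces invalid Turtle), the default graph is reachable at http://localhost:3030/ds/data?default, and the server accepts text/turtle bodies. The names chunk_lines, plan_requests, send and load_file are made up for this example.

```python
import urllib.request
from typing import Iterable, Iterator, Tuple

# Assumption: graph store endpoint for the default graph of dataset /ds.
GRAPH_URL = "http://localhost:3030/ds/data?default"


def chunk_lines(lines: Iterable[str], max_lines: int) -> Iterator[str]:
    """Yield request bodies holding at most max_lines lines each."""
    if max_lines < 1:
        raise ValueError("max_lines must be at least 1")
    batch = []
    for line in lines:
        # Make sure every triple ends with a newline before joining.
        batch.append(line if line.endswith("\n") else line + "\n")
        if len(batch) == max_lines:
            yield "".join(batch)
            batch = []
    if batch:
        yield "".join(batch)


def plan_requests(chunks: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """PUT an empty body first to clear the graph, then POST each chunk."""
    yield ("PUT", "")
    for chunk in chunks:
        yield ("POST", chunk)


def send(method: str, body: str, url: str = GRAPH_URL) -> int:
    """Send one request to the graph store and return the HTTP status."""
    request = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method=method,
        headers={"Content-Type": "text/turtle"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def load_file(path: str, max_lines: int = 50_000) -> None:
    """Stream a one-triple-per-line file to the server in small requests."""
    with open(path, encoding="utf-8") as handle:
        for method, body in plan_requests(chunk_lines(handle, max_lines)):
            send(method, body)
```

Calling load_file("geo_coordinates_bg.ttl") would then replace the default graph piece by piece. The chunk size of 50,000 lines is an arbitrary starting point; each request body is still parsed into memory on the server, so it only needs to be small enough for that. A failure partway through leaves the graph partially loaded, which is the trade-off Andy mentions against the offline bulkloader.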
