Hi Andy,

Thanks for the insight! It's really helpful. I will look into tdbloader.

Yuhan

On Thu, Aug 30, 2012 at 6:04 AM, Andy Seaborne <[email protected]> wrote:

> On 30/08/12 03:01, Yuhan Zhang wrote:
>
>> Actually, the file I used was 260MB. I tried something smaller than 1MB
>> and it worked.
>>
>> It seems the s-put Ruby script is stream-friendly. Do I have to break
>> large files into parts?
>> What's the recommended way to load large files?
>>
>
> The Fuseki server is defensive - it reads the body of the PUT into a
> temporary in-memory model to make sure it's going to be able to parse
> everything, then, when the end of the body is reached, it adds the graph to
> the model.
>
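A toy sketch of the check-everything-then-commit behaviour described above, to show why the whole request body ends up in memory at once. This is not Fuseki's code: `parse_line` and `defensive_put` are hypothetical names, and the "parser" only accepts whitespace-separated `subject predicate object .` lines.

```python
def parse_line(line):
    """Toy parser: expects 'subject predicate object .', raises ValueError otherwise."""
    parts = line.strip().split()
    if len(parts) != 4 or parts[3] != ".":
        raise ValueError(f"cannot parse: {line!r}")
    return tuple(parts[:3])


def defensive_put(store, body_lines):
    """Replace the contents of `store` only if the whole body parses."""
    staged = set()  # temporary in-memory copy of the entire request body
    for line in body_lines:
        if line.strip():
            # A bad line raises here, before `store` has been touched.
            staged.add(parse_line(line))
    # End of body reached without errors: now apply the change.
    store.clear()
    store.update(staged)
    return len(staged)
```

With a body of a few hundred MB, `staged` has to hold every triple before anything is written, which is the cost of never servicing half a request.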
> All the web operations are defensive and check their inputs so that a bad
> request does not lead to half a request being serviced.
>
> TDB transactions could overcome that to some extent, but they are not fully
> scalable, and Fuseki does not (currently) make use of the fact that TDB
> transactions are fully ACID, which would let the transaction itself catch
> bad data without messing up the database.
>
> A way to bulk load data is to load the database offline using the
> bulkloader (tdbloader).  The bulkloader knows how to cheat - it manipulates
> the internal tables directly.
>
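A minimal sketch of driving the offline bulk load from a script. It assumes the `tdbloader` command from the Jena/TDB distribution is on the PATH and that the Fuseki server is stopped while it runs, since as far as I understand the loader should have the database directory to itself. The helper names are made up for illustration; `/tmp/ds` is the `--loc` directory from the original post.

```python
import shlex
import subprocess


def tdbloader_command(db_dir, data_files):
    """Build the argument list for an offline bulk load into a TDB directory."""
    return ["tdbloader", f"--loc={db_dir}", *data_files]


def bulk_load(db_dir, data_files):
    """Run tdbloader and raise CalledProcessError if it exits non-zero."""
    cmd = tdbloader_command(db_dir, data_files)
    print("running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)
```

For example, `bulk_load("/tmp/ds", ["geo_coordinates_bg.ttl"])`, then restart the server with the same `--loc`.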
> So either split the file up, or bulk load the database offline.
>
> If you split it, use DELETE or PUT an empty graph first, then POST, POST,
> POST to append the data.
>
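A rough sketch of the split-then-append approach: one empty PUT to clear the default graph, then one POST per chunk. It assumes the file has exactly one complete triple per line with no `@prefix` declarations or multi-line statements (worth checking before relying on it), and that the graph store endpoint is `http://localhost:3030/ds/data` with the default graph addressed as `?default`. The function names and chunk size are illustrative only.

```python
import itertools
import urllib.request


def chunks(lines, size):
    """Yield lists of at most `size` lines from any iterable of lines."""
    it = iter(lines)
    while True:
        batch = list(itertools.islice(it, size))
        if not batch:
            return
        yield batch


def plan_upload(endpoint, lines, size):
    """Return (method, url, body) steps: one empty PUT, then a POST per chunk."""
    url = endpoint + "?default"
    steps = [("PUT", url, b"")]  # an empty body replaces the graph with nothing
    for batch in chunks(lines, size):
        steps.append(("POST", url, "".join(batch).encode("utf-8")))
    return steps


def send(steps):
    """Execute the steps in order; urlopen raises HTTPError on a 4xx/5xx reply."""
    for method, url, body in steps:
        req = urllib.request.Request(
            url, data=body, method=method,
            headers={"Content-Type": "text/turtle"},
        )
        with urllib.request.urlopen(req) as resp:
            print(method, resp.status)
```

Usage would be along the lines of `with open("geo_coordinates_bg.ttl", encoding="utf-8") as f: send(plan_upload("http://localhost:3030/ds/data", f, 100000))`. Each POST is still buffered by the server, so the chunk size needs to be small enough for its heap.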
>         Andy
>
>>
>> Thank you.
>>
>> Yuhan
>>
>> On Wed, Aug 29, 2012 at 6:13 PM, Yuhan Zhang <[email protected]>
>> wrote:
>>
>>  Hi all,
>>>
>>> I'm experimenting with Fuseki and ran into some trouble loading data.
>>>
>>> I followed the getting started tutorial successfully with the books.ttl
>>> file, but when I feed it a small .ttl file (1.2MB) from DBpedia, the script
>>> causes the system to halt:
>>> http://downloads.dbpedia.org/3.8/bg/geo_coordinates_bg.ttl.bz2
>>>
>>> The server was started with the following setting: fuseki-server --update
>>> --loc=/tmp/ds /ds
>>> The data was loaded in this way:
>>> ruby s-put http://localhost:3030/ds/datadefault geo_coordinates_bg.ttl
>>>
>>>
>>> I'm using the latest distribution: jena-fuseki-0.2.4
>>>
>>>
>>> Thank you.
>>>
>>> Yuhan
>>>
>>>
>>
>>
>>
>


-- 
Yuhan Zhang
Senior Software Engineer
OneScreen Inc.
[email protected] <[email protected]>
www.onescreen.com
(949) 525-4825 Ext: 177


