Thanks for the response, Andy.

So the overall picture is that I have a TDB dataset stored on disk, and I
would like to query it with a Lucene text match like the following:

PREFIX pf: <http://jena.hpl.hp.com/ARQ/property#>
SELECT ?doc {
    ?lit pf:textMatch '+text' .
    ?doc ?p ?lit
}
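
For concreteness, on the query side I have something like this in mind (an
untested sketch: the TDB and index paths are placeholders, and I'm using the
org.apache.jena.larq package names; older ARQ bundles have the same classes
under com.hp.hpl.jena.query.larq):

    import java.io.File;

    import org.apache.jena.larq.IndexLARQ;
    import org.apache.jena.larq.LARQ;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.store.FSDirectory;

    import com.hp.hpl.jena.query.Dataset;
    import com.hp.hpl.jena.query.QueryExecution;
    import com.hp.hpl.jena.query.QueryExecutionFactory;
    import com.hp.hpl.jena.query.ResultSetFormatter;
    import com.hp.hpl.jena.tdb.TDBFactory;

    public class TextMatchQuery {
        public static void main(String[] args) throws Exception {
            // Open the existing TDB dataset (placeholder path).
            Dataset dataset = TDBFactory.createDataset("/path/to/tdb");

            // Reopen a previously built Lucene index from disk and register
            // it as the default, so pf:textMatch can use it.
            IndexReader reader =
                IndexReader.open(FSDirectory.open(new File("/path/to/lucene-index")));
            LARQ.setDefaultIndex(new IndexLARQ(reader));

            String q =
                "PREFIX pf: <http://jena.hpl.hp.com/ARQ/property#>\n" +
                "SELECT ?doc {\n" +
                "    ?lit pf:textMatch '+text' .\n" +
                "    ?doc ?p ?lit\n" +
                "}";
            QueryExecution qe = QueryExecutionFactory.create(q, dataset);
            try {
                ResultSetFormatter.out(qe.execSelect());
            } finally {
                qe.close();
            }
        }
    }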

If I index the dataset in partitions, can I store the index to disk so I
don't have to repeat the process?
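
What I'm hoping works is something like this (same imports as above, plus
org.apache.jena.larq.IndexBuilderString; untested, and I'm assuming the
IndexBuilderString(File) constructor writes an ordinary Lucene index
directory that can be reopened later):

    // Build the index into a directory on disk rather than in memory,
    // so it survives a JVM restart and doesn't have to be rebuilt.
    Dataset dataset = TDBFactory.createDataset("/path/to/tdb");    // placeholder path
    File indexDir = new File("/path/to/lucene-index");             // placeholder path
    IndexBuilderString larqBuilder = new IndexBuilderString(indexDir);

    // This is the step that runs out of memory on the full model;
    // Andy's batching suggestion below addresses that.
    larqBuilder.indexStatements(dataset.getDefaultModel().listStatements());
    larqBuilder.closeWriter();   // flush and close the underlying Lucene IndexWriter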


On Sun, Mar 17, 2013 at 8:57 AM, Andy Seaborne <[email protected]> wrote:

> On 17/03/13 00:45, Martino Buffolino wrote:
>
>> Hi,
>>
>> I built a large dataset using tdbloader and now I would like to query
>> it using a Lucene index. I tried indexing it with
>> larqBuilder.indexStatements(model.listStatements()); which led to an
>> out-of-memory exception.
>>
>
> Could you give some more details?
>
> It might be that LARQ is using up RAM for something, but it might also be
> that the model has many large text literals which, combined with all the
> other uses of heap, are causing the problem, rather than LARQ per se.
>
>
>> Is there another approach to do this?
>>
>
> If it's a large database, then doing it in sections is a possibility.
>
> What might work (given I'm not sure where it is running out of memory) is
> to:
>
> Get an iterator, e.g. model.listStatements(), then index a selection from
> it (e.g. 1,000 items), then close and reopen the index, then index another
> 1,000 items from the iterator.
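>
> Roughly along these lines (an untested sketch; the batch size of 1,000 is
> arbitrary, and whether reopening the builder appends to the existing index
> or recreates it may depend on the LARQ version, so check that first):
>
>     // Index in batches, closing and reopening the on-disk index between
>     // batches so the IndexWriter's buffered state is released.
>     File indexDir = new File("/path/to/lucene-index");    // placeholder path
>     IndexBuilderString builder = new IndexBuilderString(indexDir);
>     StmtIterator iter = model.listStatements();
>     int count = 0;
>     while (iter.hasNext()) {
>         builder.indexStatement(iter.nextStatement());
>         if (++count % 1000 == 0) {
>             builder.closeWriter();                        // flush this batch
>             builder = new IndexBuilderString(indexDir);   // reopen for the next batch
>         }
>     }
>     builder.closeWriter();
>     iter.close();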
>
>         Andy
>
>
>> Thanks
