Okay, I’ll do a bit of polish and commenting and get a PR in.
I hadn’t even thought about the assembler system— well, it’s a chance to learn
about another part of Jena! {grin}
---
A. Soroka
The University of Virginia Library
> On Oct 7, 2015, at 9:35 AM, Andy Seaborne wrote:
>
> On 06/10/15 18:28, A. Soroka wrote:
>> Andy— would it be appropriate at this time to issue a PR on this Dexx-based
>> branch, so that other people can more easily comment on it?
>>
>
> Good idea. I haven't looked in depth at changes across the rest of
> base/core/arq, and a PR will make it easier to find those places.
>
> There will also need to be an assembler at some time, including ways to
> initialize it from loading files. The current ja:RDFDataset is definitely
> centred around the concept of building a dataset from models.
>
> Andy
>
>> ---
>> A. Soroka
>> The University of Virginia Library
>>
>>> On Oct 4, 2015, at 5:37 AM, Andy Seaborne wrote:
>>>
>>> On 29/09/15 15:00, A. Soroka wrote:
>>>> On Sep 27, 2015, at 5:41 AM, Andy Seaborne wrote:
>>>>> I can't try out your new stuff for a few days due to not being near
>>>>> a suitable computer.
>>>>
>>>> No problem. On my machine using Dexx, that port of the Scala types,
>>>> the branch shows improvement to within half of the stock performance.
>>>
>>> Excellent. That's looking very good. It does something, so it's going to
>>> cost something.
>>>
>>> My figures are below, on the same hardware as before - the txn/non-txn
>>> distinction is making a difference now.
>>>
>>> Licensing-wise, Dexx is MIT (with maybe some BSD-isms from Scala) which is
>>> no problem.
>>>
>>>> I have tried now with some variations using the Clojure types (shown
>>>> after my sig) and didn’t see much difference, so I’ll leave that
>>>> question alone for the moment. I wasn’t able to use Clojure’s
>>>> transient (mutate-in-place-within-a-thread/transaction)
>>>> functionality, because Clojure transients do not afford iteration,
>>>> which is needed to support find(). It seems feasible to me that a
>>>> custom implementation with the ability to use mutate-in-place within
>>>> transactions might offer more improvement, but that’s a whole ‘nuther
>>>> kettle of fish.
>>>>
>>>> I’ll spend some time soon moving on with the Dexx branch and trying
>>>> out some simple tests of the kind you’ve outlined below (and I’ll
>>>> include something that exercises property paths, which actually
>>>> happen to be very interesting for a few use cases in which I am
>>>> interested). I’m not sure how to engage real world use very
>>>> effectively. I can certainly spin up examples, but it seems like we
>>>> would want a broader set of users than just me to try it out, no?
>>>> {grin}
>>>
>>> That would be ideal but it's not always easy to do. Email to users@
>>> possibly with a quite large notice saying people are affected.
>>>
>>> I think the problem areas are around adding inference graphs to general
>>> datasets, not the details of this new dataset implementation.
>>>
>>> Discussion/proposal:
>>>
>>> * Add this as DatasetFactory.createTxnMem(),
>>> * Add DatasetFactory.createGeneral()
>>> * ?? Deprecate DatasetFactory.createMem(),
>>>   referring to createTxnMem() and createGeneral()
>>>   (other clearing up of DatasetFactory ...)
>>> * Release.
>>>
>>>
>>> Andy
>>>
>>>>
>>>> ---
>>>> A. Soroka
>>>> The University of Virginia Library
>>>
>>> 2015-01-03:
>>> jena-624-dexx branch:
>>>
>>> ==== Data: /home/afs/Datasets/BSBM/bsbm-1m.nt.gz ====
>>> Size: 1,000,312 (3.253s, 307,504 tps)
>>> ==== DSG/mix/auto (warm N=3)
>>> ==== DSG/mix/txn (warm N=3)
>>> ==== DSG/mem/auto (warm N=3)
>>> ==== DSG/mem/txn (warm N=3)
>>> ==== DSG/mix/auto (N=20)
>>> ==== DSG/mix/auto (N=20) Time: 81.064s (246,795 tps)
>>> ==== DSG/mix/txn (N=20)
>>> ==== DSG/mix/txn (N=20) Time: 80.412s (248,796 tps)
>>> ==== DSG/mem/auto (N=20)
>>> ==== DSG/mem/auto (N=20) Time: 230.129s (86,934 tps)
>>> ==== DSG/mem/txn (N=20)
>>> ==== DSG/mem/txn (N=20) Time: 129.259s (154,776 tps)
>>
>
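
A note on reading the benchmark output quoted above: the tps figures look
consistent with N passes over the 1,000,312-item data file, divided by the
elapsed time and truncated to a whole number. That reading is an assumption
inferred from the numbers, not something stated in the thread, and the
function name throughput() below is hypothetical. A minimal Python sketch
to check it:

```python
def throughput(passes, size, seconds):
    """Items processed per second, truncated to a whole number.

    Assumption: each pass processes the full data file once.
    """
    return int(passes * size / seconds)


SIZE = 1_000_312  # "Size:" line from the quoted output

# Elapsed times copied from the N=20 lines of the quoted output.
runs = {
    "DSG/mix/auto": 81.064,
    "DSG/mix/txn": 80.412,
    "DSG/mem/auto": 230.129,
    "DSG/mem/txn": 129.259,
}

for label, seconds in runs.items():
    print(f"{label}: {throughput(20, SIZE, seconds):,} tps")
```

Under that assumption the four N=20 figures in the quoted output are
reproduced, and the same formula with a single pass matches the load line
(3.253s).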