Re: Apache Jena tdbloader performance and limits

2020-05-21 Thread Dick Murray
ect. > > What kind of tuning besides the hardware was effective for you? > > Does anybody have experience with partial dumps created by > https://tools.wmflabs.org/wdumps/? > > Cheers > > Wolfgang > > On 20.05.20 at 11:22, Dick Murray wrote: > > That's a

Re: Apache Jena tdbloader performance and limits

2020-05-20 Thread Dick Murray
dy have experience with partial dumps created by > https://tools.wmflabs.org/wdumps/? > > Cheers > > Wolfgang > > On 20.05.20 at 11:22, Dick Murray wrote: > > That's a blast from the past! > > > > Not all of the details from that exchange are on the Jena list

Re: Apache Jena tdbloader performance and limits

2020-05-20 Thread Dick Murray
che Jena users, > > Some 2 years ago Laura Morales and Dick Murray had an exchange on this > list on how to influence the performance of > tdbloader. The issue is currently of interest for me again in the context > of trying to load some 15 billion triples from a > copy o

Detecting writes natively to a DatasetGraph since a particular epoch

2019-10-23 Thread Dick Murray
Hi. Is it possible to natively detect whether a write has occurred to a DatasetGraph since a particular epoch? For the purposes of caching if I perform an expensive read from a DatasetGraph knowing whether I need to invalidate the cache is very useful. Does TDB or the Mem natively track if a

Re: sparql 1.4 billion triples

2018-12-16 Thread Dick Murray
Be very careful using vmtouch especially if you call -dl as you could very easily and quickly kill a system. I've used this tool on cloud VM's to mitigate cycle times, think DBAN due to public nature of hardware. It's a fast way to an irked OS thrashing around. Dick On Sun, 16 Dec 2018 19:57

Re: Multiple Fuseki Servers in Distributed Environment

2018-06-01 Thread Dick Murray
ems that distribute SPARQL using Jena. > > Dick Murray has written a system called Mosaic that (I believe) uses > Apache Thrift to distribute the lower-level (DatasetGraph) primitives that > ARQ uses to execute SPARQL. An advantage over your plan might be that he > isn't serializing

Re: TDB2 and bulk loading

2018-03-19 Thread Dick Murray
Slow needs to be qualified. Slow because you need to load 1MT in 10s? What hardware? What environment? Are you loading a line based serialization? Are you loading from scratch or appending? D On Mon, 19 Mar 2018, 10:51 Davide, wrote: > Hi, > What is the best way to perform

Re: client/server communication protocol

2018-03-13 Thread Dick Murray
From an enterprise perspective http is well supported with years of development in associated stacks, such as load balancing etc. It also allows Devs to use different languages. That said we also employ Thrift based DGs which allow direct access from Python etc. It doesn't remove the overhead, it

Re: Best way to save a large amount of triples in TDB

2018-03-12 Thread Dick Murray
On Mon, 12 Mar 2018, 09:27 Davide Curcio, wrote: > Hi, > I want to store a large amount of data inside the TDB server with the > Quantity or size on disk? Jena API. In my code, I retrieve data for each iteration, and so I need > to store these data in TDB, but if I

Re: PrefixMapStd abbrev call to strSafeFor understanding

2018-01-19 Thread Dick Murray
recreate the node. On 19 Jan 2018 13:56, "Andy Seaborne" <a...@apache.org> wrote: On 18/01/18 16:48, Dick Murray wrote: > Is it possible to get a Pair<String, String> lexvo (left) code/002 (right) > from abbrev given the prefix map entry; > In Turt

PrefixMapStd abbrev call to strSafeFor understanding

2018-01-18 Thread Dick Murray
Is it possible to get a Pair<String, String> lexvo (left) code/002 (right) from abbrev given the prefix map entry; lexvo http://lexvo.org/id/ and the URI; http://lexvo.org/id/code/002 PrefixMapStd (actually base call) returns null because the call to; protected Pair

Re: Is There Any Way to Shorten The Waiting Time After Upload Triples in Jena?

2017-12-26 Thread Dick Murray
That's one graph in many pieces and the owner of the graph should clearly state what is what! On 26 Dec 2017 20:28, "Laura Morales" wrote: > Blank node identifiers are only limited in scope to a serialization of a > particular RDF graph, i.e. the node _:b does not represent

Re: Is There Any Way to Shorten The Waiting Time After Upload Triples in Jena?

2017-12-26 Thread Dick Murray
On 26 Dec 2017 19:10, "Laura Morales" wrote: > What is more, it gets bNode labels across files right (so using _:a in > two files is two bNodes). Thinking about this... - if the files contain anonymous blank nodes (for example in Turtle), each node (converted with RIOT)

Re: Is There Any Way to Shorten The Waiting Time After Upload Triples in Jena?

2017-12-25 Thread Dick Murray
That seems slow for the size. We bulk load triples into Windows and get similar times to Centos/Fedora on the same hardware. You can hack the tdbloader2 to run on Windows as basically you're exploiting the OS sort which on Windows is; sort [/r] [/+n] [/m kilobytes] [/l locale]

Re: Operational issues with TDB

2017-12-22 Thread Dick Murray
How big? How many? On 22 Dec 2017 8:37 pm, "Dimov, Stefan" wrote: > Hi all, > > We have a project, which we’re trying to productize and we’re facing > certain operational issues with big size files. Especially with copying and > maintaining them on the productive cloud

Re: Very very slow query when using a high OFFSET

2017-12-18 Thread Dick Murray
On 18 December 2017 at 08:07, Laura Morales wrote: > > The don't have index permutations spo, ops, pos, etc. > > Yes they have, what you're saying is wrong. See http://www.rdfhdt.org/hdt- > binary-format/#triples That's what the .hdt.index file is about, to store > more index

Re: Report on loading wikidata

2017-12-12 Thread Dick Murray
n but eventually I'll saturate it. Sent: Tuesday, December 12, 2017 at 9:20 PM From: "Dick Murray" <dandh...@gmail.com> To: users@jena.apache.org Subject: Re: Report on loading wikidata tdbloader2 For anyone still following this thread ;-) latest-truthy supposedly

Re: Report on loading wikidata

2017-12-12 Thread Dick Murray
Correct, Mosaic federates multiple datasets as one. At some point in a query find [G]SPO will get called and Mosaic will concurrently call find on each child dataset and return the set of results. The dataset can be memory or TDB or Thrift (this one's another discussion) Mosaic doesn't care as
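
The federated find described above can be sketched roughly as follows. This is not Mosaic's code: ChildDataset and FederatedFindSketch are hypothetical names, and quads are plain strings so that the sketch needs no Jena types. It only illustrates calling find on every child concurrently and returning the union of the answers.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

// Hypothetical stand-in for a child dataset (memory, TDB, Thrift, ...).
interface ChildDataset {
    List<String> find(String pattern);
}

class FederatedFindSketch {
    // Start find on every child concurrently, then union the results.
    static Set<String> find(List<ChildDataset> children, String pattern) {
        // Collecting to a list first ensures every call has been started
        // before we begin waiting on any of them.
        List<CompletableFuture<List<String>>> pending = children.stream()
                .map(child -> CompletableFuture.supplyAsync(() -> child.find(pattern)))
                .collect(Collectors.toList());

        Set<String> results = new LinkedHashSet<>();
        for (CompletableFuture<List<String>> future : pending) {
            results.addAll(future.join()); // join() waits for that child's answer
        }
        return results;
    }
}
```

Because the results are merged into a set, a quad returned by more than one child appears only once.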

Re: Avoid exception In the middle of an alloc-write

2017-12-12 Thread Dick Murray
We "hand" a transaction around using a ThreadProxy, which is basically a wrapper around an ExecutorService which does one thing at a time. You create it then give it to one or more threads which submit things to do and it returns Future's. We extend it to implement Transactional so it works with
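
A minimal sketch of the idea in this message, assuming nothing about the real ThreadProxy beyond what is said here: a wrapper around a single-threaded ExecutorService that does one thing at a time and returns Futures. The class name ThreadProxySketch and the call helper are hypothetical, and the Transactional extension is left out.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: every task submitted through the proxy runs on one
// dedicated thread, so thread-bound state (such as a transaction) is only
// ever touched by that thread, whichever thread did the submitting.
class ThreadProxySketch implements AutoCloseable {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    // Queue a task; tasks run one at a time, in submission order.
    <T> Future<T> submit(Callable<T> task) {
        return executor.submit(task);
    }

    // Convenience helper: submit, wait, and rethrow failures unchecked.
    <T> T call(Callable<T> task) {
        try {
            return submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    @Override
    public void close() {
        executor.shutdown(); // lets queued tasks finish, then stops the thread
    }
}
```

Several threads can share one instance and call submit; the work itself still executes sequentially on the proxy's own thread.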

Re: Report on loading wikidata

2017-12-12 Thread Dick Murray
Sent: Monday, December 11, 2017 at 11:31 AM From: "Dick Murray" <dandh...@gmail.com> To: users@jena.apache.org Subject: Re: Report on loading wikidata Inline... On 10 December 2017 at 23:03, Laura Morales <laure...@mail.com> wrote: > Thank you a lot Dick! Is this

Re: Report on loading wikidata

2017-12-12 Thread Dick Murray
Understand, I'm running sort and uniq on truthy out of interest... On 12 December 2017 at 10:31, Andy Seaborne <a...@apache.org> wrote: > > > On 12/12/17 10:06, Dick Murray wrote: > ... > >> As an aside there are duplicate entries in the data-triples.tmp file, is >

Re: Report on loading wikidata

2017-12-12 Thread Dick Murray
ms with tdbloader2 with complex --sort-args (it > only handles one single arg/value correctly). My main trick was to put in > a script for "sort" that had the required settings built-in. I wanted to > set --compress, -T and the buffer size. > > On 10/12/17 21:18, Dick Mu

Re: Report on loading wikidata

2017-12-11 Thread Dick Murray
Inline... On 10 December 2017 at 23:03, Laura Morales wrote: > Thank you a lot Dick! Is this test for tdbloader, tdbloader2, or > tdb2.tdbloader? > > > 32GB DDR4 quad channel > > 2133 or higher? > 2133 > > 3 x M.2 Samsung 960 EVO > > Are these PCI-e disks? Or SATA? Also,

Re: Report on loading wikidata

2017-12-10 Thread Dick Murray
Ryzen 1920X 3.5GHz, 32GB DDR4 quad channel, 3 x M.2 Samsung 960 EVO, 172K/sec 3h45m for truthy. Is it possible to split the index files into separate folders? Or sym link the files, if I run the data phase, sym link, then run the index phase? Point me in the right direction and I'll extend the

Re: TDB Loader 2 and TDB2 Loader

2017-12-06 Thread Dick Murray
nd tdbloader2data. > > ajs6f > > > On Dec 6, 2017, at 2:50 PM, Dick Murray <dandh...@gmail.com> wrote: > > > > TDB Loader 2, where does it call the Unix sort please? I'm obviously > > looking too hard! > > > > TDB2 Loader does a simple .add(Quad)? I'm not missing something? > > > > Dick. > >

TDB Loader 2 and TDB2 Loader

2017-12-06 Thread Dick Murray
TDB Loader 2, where does it call the Unix sort please? I'm obviously looking too hard! TDB2 Loader does a simple .add(Quad)? I'm not missing something? Dick.

Re: tdb2.tdbloader performance

2017-12-02 Thread Dick Murray
Hello. On 2 Dec 2017 8:55 pm, "Andy Seaborne" wrote: Short story I used the following "reasonable" device > > Dell M3800 > Fedora 27 > 16GB SODIMM DDR3 Synchronous 1600 MHz > CPU cache L1/256KB,L2/1MB,L3/6MB > Intel(R) Core(TM) i7-4702HQ CPU @ 2.20GHz

Re: tdb2.tdbloader performance

2017-12-01 Thread Dick Murray
Hi. Sorry for the delay :-) Short story I used the following "reasonable" device Dell M3800 Fedora 27 16GB SODIMM DDR3 Synchronous 1600 MHz CPU cache L1/256KB,L2/1MB,L3/6MB Intel(R) Core(TM) i7-4702HQ CPU @ 2.20GHz 4 cores 8 threads to load part of the latest-truthy.nt from

Re: tdb2.tdbloader performance

2017-11-28 Thread Dick Murray
LOL, there's lots of things where I'd like to "move the problem elsewhere". I've achieved concurrent 120K on the server hardware but it depends on the input. There's another recent Jena thread regarding sizing and that's tied up with what's in the input. I see the same thing with loading data,

Jena 3.2.0-rc1 issue

2017-05-15 Thread Dick Murray
This is probably me but... I've got a collection of import errors in my Jena 3.2.0-rc1 fork, the common issue being the import prefix "org.apache.jena.ext"... i.e. import org.apache.jena.ext.com.google.common.cache.Cache ; in jena-arq FactoryRDFCaching I've checked the github apache jena

Re: Materialize query

2017-04-26 Thread Dick Murray
I've seen this type of statement in regard to Oracle whereby a materialized query is disk based and updated periodically based on the query. It's useful in BI where you don't require the latest data. As to RDF the closest I can parallel is persisting inference (think RDFS subclass of i.e. A -> B

Re: Predicates with no vocabulary

2017-04-12 Thread Dick Murray
It is for this reason that I use and as a nod to my Cisco engineer days and example.org... :-) As Martynas Jusevičius said give it a little thought. On 12 April 2017 at 17:37, Martynas Jusevičius wrote: > It would not be an error as long it is a valid URI. > >

Re: Predicates with no vocabulary

2017-04-12 Thread Dick Murray
I use "urn:ex:..." in a lot of my test code (short for "urn:example:"). Then the predicate is "urn:ex:time/now" or "urn:ex:time/duration" or whatever you need... On 12 April 2017 at 09:49, Laura Morales wrote: > > The question is a bit unclear. If there is no existing

Re: Binary protocol

2017-04-05 Thread Dick Murray
I think that worked; wants to merge 1 commit into apache:master from dick-twocows:master if so I'll compile the Thrift file and commit that too... On 5 April 2017 at 19:28, Andy Seaborne <a...@apache.org> wrote: > Should be - let's try it! > > Andy > > > On 05/

Re: Binary protocol

2017-04-05 Thread Dick Murray
the appropriate handler to transform and load. On 5 April 2017 at 12:54, Andy Seaborne <a...@apache.org> wrote: > > > On 04/04/17 20:26, Dick Murray wrote: > >> I'd be happy to supply the current code we have, just need to get the >> current project delivered (classic

Re: Why we need Fuseki

2017-04-04 Thread Dick Murray
..@apache.org> wrote: On 04/04/17 19:02, Dick Murray wrote: > Slightly lateral on the topic but we use a Thrift endpoint compiled against > Jena to allow multiple languages to use Jena. Think interface supporting > sparql, sparul and bulk load... > I'd like to put in binary vers

Re: Why we need Fuseki

2017-04-04 Thread Dick Murray
Slightly lateral on the topic but we use a Thrift endpoint compiled against Jena to allow multiple languages to use Jena. Think interface supporting sparql, sparul and bulk load... On 3 Apr 2017 6:36 pm, "Martynas Jusevičius" wrote: > By using uniform protocols such as

Re: Jena scalability

2017-03-26 Thread Dick Murray
On 26 Mar 2017 5:20 pm, "Laura Morales" wrote: - Is Jena a "native" store? Or does it use some other RDBMS/NoSQL backends? It has memory, TDB and SDB (I'm not sure of the current state) - Has anybody ever done tests/benchmarks to see how well Jena scales with large datasets

Re: Understanding DatasetGraph getLock() (DatasetGraphInMem throwing a curve ball)...

2017-03-24 Thread Dick Murray
rpose. I'm not actually sure we have a good > non-blocking method for your use right now. We have inTransaction(), but > that's not too helpful here. > > But someone else can hopefully point to a technique that I am missing. > > > --- > A. Soroka > The University of Virginia

Understanding DatasetGraph getLock() (DatasetGraphInMem throwing a curve ball)...

2017-03-24 Thread Dick Murray
Hi. Is there a way to get what Transactional a DatasetGraph is using and specifically what Lock semantics are in force? As part of a distributed DatasetGraph implementation I have a DatasetGraphTry wrapper which adds Boolean tryBegin(ReadWrite) and as the name suggests it will try to lock the

DatasetGraph, Context serialization and thrift implementation, BNode distribution/collision.

2017-03-03 Thread Dick Murray
Hi. Question regarding the design thoughts behind Context and the callbacks. Also merging BNodes... I have implemented a Thrift based RPC DatasetGraph consisting of a Client (implements DatasetGraph) which forwards calls to an IFace (generated from a Thrift file which closely mimics the

Re: Release vote : 3.2.0

2017-02-01 Thread Dick Murray
much less of an issue to > find Linux testers. Windows seems to be generally the hardest platform to > get results for. I certainly didn't intend any more than that, but I copied > that list from earlier release vote announcements. (!) > > But maybe I am missing some history? > > a

Re: Release vote : 3.2.0

2017-02-01 Thread Dick Murray
Hi. Under checking Windows and Mac OS's are listed but not Linux. Is Jena assumed to pass? I'm running Jena 3.2 snapshot on Ubuntu 16.04 and Centos 7. If you haven't broken anything in the snapshot then I vote release. ;-) On 1 Feb 2017 16:09, "A. Soroka" wrote: >

Re: Jena 3.2.0-SNAPSHOT Node.ANY serialization causes StackOverFlow when called from Kryo JavaSerialization.

2017-01-23 Thread Dick Murray
by accident because it has a field. > > Initialization is in org.apache.jena.system.SerializerRDF, which is > called from InitRIOT which is called by by system initialization based on > ServiceLoader. > > Andy > > > On 20/01/17 18:35, Dick Murray wrote: > >> Whilst this

Jena 3.2.0-SNAPSHOT Node.ANY serialization causes StackOverFlow when called from Kryo JavaSerialization.

2017-01-20 Thread Dick Murray
Whilst this issue is reported and possibly caused by Kryo I think it's my understanding of how Jena is or is not serializing... I'm using Jena 3.2.0-SNAPSHOT and Kryo(Net) to serialize Jena nodes but Kryo baulks when asked to handle a (the) Node_ANY; Exception in thread "Server"

Re: How I handle "Null Pointer Exception"

2017-01-18 Thread Dick Murray
Sorry, that should have been "not" asked on the Jena user group... On 18 Jan 2017 7:09 pm, "Dick Murray" <dandh...@gmail.com> wrote: You need to learn the difference between == and .equals(). Please read up on basic Java skills! These questions should be as

Re: How I handle "Null Pointer Exception"

2017-01-18 Thread Dick Murray
You need to learn the difference between == and .equals(). Please read up on basic Java skills! These questions should be asked on the Jena user group... On 18 Jan 2017 1:14 pm, "Sidra shah" wrote: Hello Lorenz, its not giving me the exception now but it does not
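
The distinction mentioned above, as a small sketch. The class and method names are made up for illustration and are not taken from the thread.

```java
import java.util.Objects;

class EqualsSketch {
    // == compares references: true only when both variables point at the
    // very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    // equals() compares values. Objects.equals also copes with null on
    // either side, which avoids a NullPointerException.
    static boolean sameValue(String a, String b) {
        return Objects.equals(a, b);
    }
}
```

Two distinct String objects holding the same characters are equal by value but are not the same object, which is why == often gives surprising results on strings.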

Re: What are the Alternatives of DBpedia

2017-01-15 Thread Dick Murray
Google for example RDF datasets in a serialisation supported by Jena. A web search really is your best friend for this... On 15 Jan 2017 3:13 pm, "kumar rohit" wrote: I want to know, like DBpdia, what are other sources where we can get data from and supported also by

Re: Semantic Of Jena rule

2017-01-12 Thread Dick Murray
An example rule which you can test and then expand on is; [Manager: (?E rdf:type NS:Employee), (?E NS:netSalary ?S), greaterThan (?S, 5000) -> (?E rdf:type NS:Manager)] Also see https://jena.apache.org/documentation/inference/ On 12 Jan 2017 19:15, "tina sani" wrote:

Re: Jena and Spark and Elephas

2016-12-22 Thread Dick Murray
x10 compression so applied > to RDD data I'd expect that or more. > > There are line based output formats (I don't know if they work with > Elephas - no reason why not in principle). > > http://jena.apache.org/documentation/io/rdf-output.html# > line-printed-formats > > Se

Re: Jena and Spark and Elephas

2016-12-21 Thread Dick Murray
a The University of Virginia Library > On Dec 21, 2016, at 2:17 PM, Dick Murray <dandh...@gmail.com> wrote: > > Hi, on a similar vein I have a modified NTriple reader which uses a prefix > file to reduce the file size. Whilst the serialisation allows parallel > processi

Re: Jena and Spark and Elephas

2016-12-21 Thread Dick Murray
Hi, on a similar vein I have a modified NTriple reader which uses a prefix file to reduce the file size. Whilst the serialisation allows parallel processing in spark the file sizes were large and this has reduced them to 1/10 the original size on average. There is not an existing line based

Re: Jena and Spark and Elephas

2016-12-17 Thread Dick Murray
Excellent, I was currently wrapping and unwrapping as Strings which fixed another issue along with prefixing bnodes to remove clashes between TDB's. I'll pull and refactor my code... On 17 Dec 2016 20:03, "Andy Seaborne" wrote: Related: Jena now provides "Serializable" for

Re: Queries as functions

2016-12-17 Thread Dick Murray
Just posted a question regarding Spark because I'm heading down the streaming route as we're aggregating multiple large datasets together and our 1.5TB TDB was causing us some issues. We have many large graph writes of between 1-4Mb triples which I currently write to a number of TDB's and use a

Re: Unsupported major

2016-10-19 Thread Dick Murray
s available in sub shells. > > export JAVA_HOME= /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java > > I have read an issue (JENA-1035), that fuseki-server script ignores > JAVA_HOME variable while it executes the "java" command. > Has it been fixed? > Unknown, as i d

Re: Unsupported major

2016-10-19 Thread Dick Murray
t JAVA_HOME=c:\jre8 > Thanks and best regards, > Sandor > > > On 19.10.2016 at 15:36, Dick Murray wrote: > >> Hi. >> >> Check what version of JRE you have with java -version >> >> dick@Dick-M3800:~$ java -version >> java version "1.8.0_1

Re: Unsupported major

2016-10-19 Thread Dick Murray
Hi. Check what version of JRE you have with java -version dick@Dick-M3800:~$ java -version java version "1.8.0_101" Java(TM) SE Runtime Environment (build 1.8.0_101-b13) Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode) Your exception should say what version it is having trouble
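
For reading the number in an "Unsupported major.minor version" error: from Java 5 onwards the class file major version is the Java release plus 44, so 52.0 means the class was compiled for Java 8. A small sketch of that mapping follows; the class and method names are made up for illustration.

```java
class ClassFileVersionSketch {
    // Maps a class file major version to the Java release that produces it,
    // e.g. 51 -> 7, 52 -> 8. Valid for Java 5 (major 49) and later.
    static int javaRelease(int major) {
        if (major < 49) {
            throw new IllegalArgumentException("pre-Java 5 class file version: " + major);
        }
        return major - 44;
    }
}
```

If the release reported this way is newer than the one printed by java -version, the classes were built for a newer JRE than the one running them.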

Re: Concurrent read with unmatched optional clause causes exception in Jena 3.1.

2016-10-18 Thread Dick Murray
le}/log4j.properties Is it worth noting this behaviour somewhere? Thanks for the point in the right direction. Dick. On 17 October 2016 at 21:33, Andy Seaborne <a...@apache.org> wrote: > > > On 17/10/16 21:13, Dick Murray wrote: > >> Hi. >> >> On 17 Oct 2016 18:16, "

Re: Concurrent read with unmatched optional clause causes exception in Jena 3.1.

2016-10-17 Thread Dick Murray
On 17 Oct 2016 21:33, "Andy Seaborne" <a...@apache.org> wrote: > > > > On 17/10/16 21:13, Dick Murray wrote: >> >> Hi. >> >> On 17 Oct 2016 18:16, "Andy Seaborne" <a...@apache.org> wrote: >>> >>> >>> Are

Concurrent read with unmatched optional clause causes exception in Jena 3.1.

2016-10-17 Thread Dick Murray
Hi. I'm getting odd behaviour in Jena when I execute the same query concurrently. The query has an optional which is unmatched but which appears to cause a java.lang.String exception from the atlas code. This only happens if multiple queries are submitted concurrently and closely. On a "fast"

Stall when committing a write transaction.

2016-08-08 Thread Dick Murray
-08-08T10:05:04.216Z]/[PT1M50.931S]] Value [org.iungo.result.Result] *Dick Murray* Technology Specialist *Business Collaborator Limited* 9th Floor, Reading Bridge House, George Street, Reading, RG1 8LS, United Kingdom T 0044 7884 111729 *|* E dick.mur...@groupbc.com <alistair.wa...@g

Re: Jena TDB OOME GC overhead limit exceeded

2016-07-27 Thread Dick Murray
:10, "Andy Seaborne" <a...@apache.org> wrote: > > On 27/07/16 13:19, Dick Murray wrote: >> >> ;-) Yes I did. But then I switched to the actual files I need to import and >> they produce ~3.5M triples... >> >> Using normal Jena 3.1 (i.e. no speci

Re: Jena TDB OOME GC overhead limit exceeded

2016-07-27 Thread Dick Murray
... Just before hitting send I'm at pass 13 and the [B maxed at just over 4Gb before dropping back to 2Gb. Dick. On 27 July 2016 at 11:47, Andy Seaborne <a...@apache.org> wrote: > On 27/07/16 11:22, Dick Murray wrote: > >> Hello. >> >> Something doesn't add up h

Re: Jena TDB OOME GC overhead limit exceeded

2016-07-27 Thread Dick Murray
[I 8: 310 27899112 [Ljava.util.HashMap$Node; 9:935412 22449888 java.lang.Long 10:328196 18378976 java.nio.ByteBufferAsIntBufferB 2016-07-27T09:52:49.082Z begin WRITE Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "ma

Re: Jena TDB OOME GC overhead limit exceeded

2016-07-26 Thread Dick Murray
t gave up on F8 after counting to 500 >>> >>> Dick. >>> >> >> Make sure you have all the dependencies successfully resolved with >> mvn -o dependency:tree. >> >> The Apache snapshot repo was having a bad day earlier and Multiset is >> from

Re: Jena TDB OOME GC overhead limit exceeded

2016-07-26 Thread Dick Murray
rted to RDF, and loaded with tdbloader? > > If TDB is using DirectByteBuffers, have you set > "transactionJournalWriteBlockMode" to "direct"? > > You need to increase the direct memory space, not the heap. > > Andy > > > On 26/07/16 10:14, Dick Murray

Jena TDB OOME GC overhead limit exceeded

2016-07-26 Thread Dick Murray
Hi. I've got a repeatable problem with Jena 3.1 when performing a bulk load. The bulk load converts a file into ~200k quads and adds them to a TDB instance within a normal begin write, add quads and commit. Initially this completes in 30-40 seconds, However if I repeat the process (with the same

Re: SPI DatasetGraph creating Triples/Quads on demand using DatasetGraphInMemory

2016-04-01 Thread Dick Murray
Hi. I've pushed up a draft to https://github.com/dick-twocows/jena-dev.git. This has two test cases; Echo : which will echo back the find GSPO call i.e. call find ABCD and you will get the Quad ABCD back. This does not cache between calls. CSV : which will transform a CSV file into Quads i.e.

Re: SPI DatasetGraph creating Triples/Quads on demand using DatasetGraphInMemory

2016-03-18 Thread Dick Murray
> ... but it seems that it's returning as a factor > > inline lambdas are apparently faster than the same code with a class > implementation - the compiler emits an invokedynamic for the lambda > > and Java Stream can cause a lot of short-lived objects. > > Andy >

Re: SPI DatasetGraph creating Triples/Quads on demand using DatasetGraphInMemory

2016-03-15 Thread Dick Murray
Eureka moment! It returns a new Graph of a certain type. Whereas I need the graph node to determine where the underlying data is. Cheers Dick. On 15 March 2016 at 11:28, Andy Seaborne <a...@apache.org> wrote: > On 15/03/16 10:30, Dick Murray wrote: > >> Sorry, supportsTransac

Re: SPI DatasetGraph creating Triples/Quads on demand using DatasetGraphInMemory

2016-03-15 Thread Dick Murray
age >> From: Andy Seaborne <a...@apache.org> >> Date: 13/03/2016 7:54 pm (GMT+00:00) >> To: users@jena.apache.org >> Subject: Re: SPI DatasetGraph creating Triples/Quads on demand using >>DatasetGraphInMemory >> >> On 10/03/16 20:10, Dic

Re: SPI DatasetGraph creating Triples/Quads on demand using DatasetGraphInMemory

2016-03-10 Thread Dick Murray
s, Node p, Node o) >{ return find(s,p,o).findAny().isPresent() ; } >default boolean contains(Node g, Node s, Node p, Node o) >{ return find(g,s,p,o).findAny().isPresent() ; } > > // Prefixes ?? > } > > > https://github.com/afs/AFS-Dev/tree/master/src/main/java/projects/dsg

Re: SPI DatasetGraph creating Triples/Quads on demand using DatasetGraphInMemory

2016-03-04 Thread Dick Murray
ropriately-ordered index based on the fixed and variable > slots in the find pattern and using the concrete methods above to stream > tuples back. > >>>> > >>>> As to why you are seeing your methods called in some places and not > in others, DatasetGraphBaseFi

POM issue with 3.0.1

2016-02-05 Thread Dick Murray
Hi I'm trying to get the 3.0.1 to build using eclipse/maven but it refuses to "find" it in the central repository. I ran mvn dependency:get -Dartifact=org.apache.jena:apache-jena-libs:jar:3.0.1 and got the following... Am I missing something..? [INFO] Resolving

Re: Inserting large volumes into a RW TDB store.

2014-10-21 Thread Dick Murray
I might be confusing the DynamicDataset... Dick On 20 October 2014 20:40, Dick Murray dandh...@gmail.com wrote: Thanks that confirms what I thought. Crazy idea time! Am I correct in thinking that there is a dataset view which allows you to present multiple datasets as one? I'm sure I saw

Inserting large volumes into a RW TDB store.

2014-10-20 Thread Dick Murray
Hello all. Are there any pointers to inserting large volumes of data in a persistent RW TDB store please? I currently have an 8M line 500MB+ input file which is being parsed by JavaCC and the created quads inserted into a TDB store. The process generates 120M quads and takes just over 2hrs which

Re: Inserting large volumes into a RW TDB store.

2014-10-20 Thread Dick Murray
estimated that it will grow by the same amount every working day which equates to 31,200M or 31B triples and 13,780GB or 14TB on disk in a year... Dick On 20 Oct 2014 17:56, Andy Seaborne a...@apache.org wrote: On 20/10/14 10:12, Dick Murray wrote: Hello all. Are there any pointers to inserting

Re: Dynamic graph/model inference within a select.

2013-06-19 Thread Dick Murray
, Andy Seaborne a...@apache.org wrote: On 18/06/13 18:22, Dick Murray wrote: I'm looking for dynamic inference based on the select. Given a dataset with multiple named graphs I would like the ability to wrap specific named graphs based on some form of filter when the select is processed

Re: Dynamic graph/model inference within a select.

2013-06-18 Thread Dick Murray
together would be good. It seems to be querying the dataset without the inference graph. I don't see where you query the dataset (and which one) if (graphNode.getURI().equals(types.getURI())) { if (graphNode.equals(types.asNode())) { On 18/06/13 14:22, Dick Murray wrote: Hi

Re: Issue adding Triples to Graph which has been added to a DatasetGraph.

2012-11-30 Thread Dick Murray
. update before adding. Andy A TDB dataset is a triples+quad store. You can't add an in-memory storage-backed graph to a dataset backed by TDB. If you want a mixed dataset, you can create an in-memory dataset and add in TDB backed graphs. Andy On 28/11/12 20:32, Dick Murray

Re: Extending UpdateVisitor to provide security within visit(UpdateVisitor visitor).

2012-10-30 Thread Dick Murray
); } } Hope this helps, Rob On 10/29/12 6:43 AM, Dick Murray dandh...@gmail.com wrote: Hi all I need to permit/deny certain SPARUL update operations e.g. deny create| drop graph. I've looked at the UpdateEngineMain and UpdateVisitor classes and was wondering if anyone has extended