Hi Rob,

I have given up working with RDF for the moment and am experimenting with Neo4j. In the process I noticed that it has a LOAD CSV clause that effectively does what I suggested in this thread.
http://neo4j.com/blog/neo4j-2-1-graph-etl/

Cheers,

Richard

On Mon, Mar 17, 2014 at 2:25 PM, Richard Morgan <[email protected]> wrote:

> Hi Rob,
>
> I'll not stop you as you are not oversimplifying things at all. However I
> would respectfully disagree that it is an inappropriate thing to do to the
> rules engine.
>
> Fundamentally, the job of a rules engine is to construct, transform and
> enrich a graph. I would like to load and reason in a single step without
> involving lots of libraries and integrations. I personally believe that the
> world of Linked Data needs dramatic simplification as the bar to entry is
> too high.
>
> The system I used at my previous employer basically did this with a rules
> framework that was more closely related to SPARQL than Jena, which I agree
> has the most awful syntax. However it would seem pragmatic to do the same
> thing.
>
> What I actually want to do is write a 'script' in Jena rules which can
> interrogate a Splunk search head, extract knowledge and instantiate
> triples. I could do some ****ing cool things with that.
>
> I'll look at using SPARQL as you suggest, but if I'm calling the endpoint
> 10000s of times, once for each set of bindings to create my models, I assume
> that it's going to suffer dreadful performance.
>
> Cheers,
>
> Richard
>
> On Mon, Mar 17, 2014 at 2:12 PM, Dave Reynolds <[email protected]> wrote:
>
>> On 17/03/14 12:26, [email protected] wrote:
>>
>>> Hi Dave,
>>>
>>> That is an enormous shame. This is a methodology I've worked with in a
>>> different library and it makes a very simple way to instantiate complex
>>> structures from tabular structures.
>>>
>>> For instance, consider the pseudo code below. It reads a three column
>>> CSV file, then it creates URIs for objects and combines these URIs and
>>> attributes in the forward chaining bit. This both loads and denormalizes
>>> the serialized object structure back into a graph / tree.
>>>
>>> [loadable:
>>>     Load("/mytable.csv", ?customerName, ?customerAccount, ?service)
>>>     makeURI(?custObj, ns:, ?customerAccount)
>>>     makeURI(?serviceObj, ns:, ?service)
>>>     ->
>>>     (?custObj a ns:CustomerAccount)
>>>     (?custObj ns:customerName ?customerName)
>>>     (?custObj ns:customerAccountId ?customerAccount)
>>>     (?custObj ns:customerHasService ?serviceObj)
>>>     (?serviceObj a ns:Service)
>>>     (?serviceObj ns:serviceId ?service)
>>> ]
>>>
>>> I am sure you can see that this method can be applied to lots of
>>> different scenarios and is a very simple way to load bindings and
>>> immediately create a graph from those bindings.
>>>
>>> Do you know of other libraries that might support such operations?
>>
>> Seems like all you need to do here is instantiate some triple patterns
>> based on bindings from a data source.
>>
>> That's pretty straightforward to do directly. I'm not sure the rules
>> engine would really help you much, you wouldn't be using rule chaining in
>> any case and the syntax for the rules is hardly friendly :)
>>
>> I'd be inclined to just use a template instantiation approach. The
>> templates could be e.g. SPARQL update or some custom syntax.
>>
>> Typically the only tricky bit of template-based approaches is all the
>> processing builtins that you want for string mangling etc. I've typically
>> used embedded ruby or javascript for that in the past, though I have
>> recently [1] used the lighter-weight Apache jexl [2] with some success.
>>
>> Dave
>>
>> [1] https://github.com/epimorphics/dclib/wiki
>> [2] http://commons.apache.org/proper/commons-jexl/
>>
>>> Cheers,
>>>
>>> Richard
>>>
>>> Sent from Surface Pro
>>>
>>> From: Dave Reynolds
>>> Sent: Monday, March 17, 2014 12:13 PM
>>> To: [email protected]
>>>
>>> On 17/03/14 11:44, Richard Morgan wrote:
>>>
>>>> Hi Dave,
>>>>
>>>> Thank you for your response, I'm glad to have my thoughts confirmed. Is
>>>> it possible to write my own generators and register them like I have
>>>> with builtins?
>>>
>>> No, sorry.
>>>
>>> For the forward rule system there's simply no equivalent notion.
>>>
>>> For the backward rules there is the notion of generators but they aren't
>>> designed as an extension point (far from it).
>>>
>>>> The problem I want to solve isn't the regex example above, it's more
>>>> about generating bindings so I can feed them into a forward rule and
>>>> then instantiate triples as a general pattern.
>>>
>>> Hard.
>>>
>>> You can write builtins which assert information directly into the
>>> deductions graph which can generate as many triples as you want. That's
>>> relatively easy and safe. However, it bypasses all the rule machinery
>>> and means that other rules don't see the results and you don't get to
>>> instantiate more patterns.
>>>
>>> It might just be possible to write a builtin which would directly call
>>> the rule engine to add a rule firing to the conflict set
>>> (RETEEngine.requestRuleFiring) and pass in a series of different
>>> manufacturing binding environments to each firing request.
>>>
>>> However, I've never tried anything like that and prodding the underlying
>>> engine mechanics from within a builtin is not guaranteed to be safe!
>>>
>>> Dave
>>>
>>>> Cheers,
>>>>
>>>> Richard
>>>>
>>>> On Mon, Mar 17, 2014 at 9:06 AM, Dave Reynolds <[email protected]> wrote:
>>>>
>>>>> On 14/03/14 13:53, Richard Morgan wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I would like to extend the base regex function in Jena to provide
>>>>>> more than one match result.
>>>>>
>>>>> I don't think that's possible.
>>>>>
>>>>>> For instance I would like the following rule
>>>>>>
>>>>>> [ myregex("the cat sat on the mat", "(.at)", ?token)
>>>>>>   -> (<http://a> <http://b> ?token) ]
>>>>>>
>>>>>> to return
>>>>>>
>>>>>> - [http://a, http://b, "cat"]
>>>>>> - [http://a, http://b, "sat"]
>>>>>> - [http://a, http://b, "mat"]
>>>>>>
>>>>>> From looking at how BindingEnvironment works, I can only return
>>>>>> a single binding per variable.
>>>>>
>>>>> Correct.
>>>>>
>>>>> In Jena rules, builtins are essentially only used as filters on rule
>>>>> firings; they aren't generators.
>>>>>
>>>>> In the forward rule case (which is suggested by your notation above)
>>>>> that wouldn't make sense anyway - forward rules either fire or they
>>>>> don't, there's no backtracking.
>>>>>
>>>>> In the backward rule case there is backtracking, but the interface
>>>>> for builtins doesn't support their use as generators.
>>>>>
>>>>> Dave
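
For reference, the multi-match behaviour asked for in the original message (one triple per regex match) is easy to get outside the rule engine. The following is only a minimal Python sketch of that idea, not Jena code: the function name regex_triples is hypothetical, and triples are represented as plain tuples rather than as nodes in an RDF model.

```python
import re


def regex_triples(text, pattern, subject, predicate):
    """Return one (subject, predicate, token) tuple per regex match.

    The pattern is assumed to contain one capturing group, as in the
    "(.at)" example from the thread.
    """
    return [(subject, predicate, m.group(1)) for m in re.finditer(pattern, text)]


if __name__ == "__main__":
    for triple in regex_triples(
        "the cat sat on the mat", r"(.at)", "http://a", "http://b"
    ):
        print(triple)
```

The tuples could then be loaded into whatever model or store is in use; that step is deliberately left out because it depends on the library.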

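
The template instantiation approach Dave suggests for the three column CSV case can be sketched in the same way. This is an illustration under stated assumptions, not the dclib or Jena implementation: the namespace http://example.org/ns# stands in for the "ns:" prefix of the pseudo code, the helper names uri, lit and rows_to_triples are made up for the example, and the output is N-Triples-style strings in tuples.

```python
import csv
import io
from urllib.parse import quote

# Assumed namespace; the thread only ever writes "ns:".
NS = "http://example.org/ns#"
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"


def uri(local):
    """Build a URI term from a local name (plays the role of makeURI)."""
    return "<" + NS + quote(local.strip(), safe="") + ">"


def lit(value):
    """Build a plain literal term, escaping backslashes and quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def rows_to_triples(csv_text):
    """Instantiate the six triple patterns from the thread for each CSV row.

    Each row is expected to hold: customerName, customerAccount, service.
    """
    for name, account, service in csv.reader(io.StringIO(csv_text)):
        cust, svc = uri(account), uri(service)
        yield (cust, RDF_TYPE, uri("CustomerAccount"))
        yield (cust, uri("customerName"), lit(name))
        yield (cust, uri("customerAccountId"), lit(account))
        yield (cust, uri("customerHasService"), svc)
        yield (svc, RDF_TYPE, uri("Service"))
        yield (svc, uri("serviceId"), lit(service))


if __name__ == "__main__":
    sample = "Alice,ACC-1,broadband\nBob,ACC-2,voip\n"  # made-up sample rows
    for s, p, o in rows_to_triples(sample):
        print(s, p, o, ".")
```

Because each row is turned into triples in a single pass, there is no per-binding call to an endpoint; the resulting statements can be written to a file and bulk-loaded, after which any rules can run over the loaded graph.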