Hi Rob,

I have given up working with RDF for the moment and am experimenting with
Neo4j. In the process I noticed that it has a LOAD CSV clause that
effectively does what I suggested in this thread.

http://neo4j.com/blog/neo4j-2-1-graph-etl/
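
For readers skimming the thread, the load-and-instantiate pattern in the pseudocode quoted further down can be sketched in plain Java, with no Jena or Neo4j dependency. This is only an illustration under stated assumptions: the class and method names (CsvToTriples, instantiate), the example namespace, and the naive comma split (no quoted CSV fields) are hypothetical, and it does not reflect how LOAD CSV or the Jena rule engine work internally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns one three-column CSV row into N-Triples lines.
final class CsvToTriples {
    static final String RDF_TYPE =
            "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>";

    // Builds an IRI term; assumes the local part needs no escaping.
    static String iri(String ns, String local) {
        return "<" + ns + local + ">";
    }

    // Builds a plain string literal, escaping backslashes and double quotes.
    static String literal(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Instantiates the six triple patterns from the pseudocode for one row:
    // customerName, customerAccount, service.
    static List<String> instantiate(String csvLine, String ns) {
        String[] cols = csvLine.split(",", -1); // naive split, no quoted fields
        if (cols.length != 3) {
            throw new IllegalArgumentException("expected 3 columns: " + csvLine);
        }
        String name = cols[0].trim();
        String account = cols[1].trim();
        String service = cols[2].trim();

        String cust = iri(ns, account); // stands in for makeURI(?custObj, ...)
        String svc = iri(ns, service);  // stands in for makeURI(?serviceObj, ...)

        List<String> triples = new ArrayList<>();
        triples.add(cust + " " + RDF_TYPE + " " + iri(ns, "CustomerAccount") + " .");
        triples.add(cust + " " + iri(ns, "customerName") + " " + literal(name) + " .");
        triples.add(cust + " " + iri(ns, "customerAccountId") + " " + literal(account) + " .");
        triples.add(cust + " " + iri(ns, "customerHasService") + " " + svc + " .");
        triples.add(svc + " " + RDF_TYPE + " " + iri(ns, "Service") + " .");
        triples.add(svc + " " + iri(ns, "serviceId") + " " + literal(service) + " .");
        return triples;
    }

    public static void main(String[] args) {
        // Example row and namespace are made up for illustration.
        for (String t : instantiate("Acme Ltd,A100,broadband", "http://example.org/ns#")) {
            System.out.println(t);
        }
    }
}
```

Each row is handled independently, so the same method can be called in a loop over a file. Anything that needs chaining over the resulting triples would still have to be done by a reasoner afterwards.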

Cheers,

Richard


On Mon, Mar 17, 2014 at 2:25 PM, Richard Morgan <[email protected]> wrote:

> Hi Rob,
>
> I won't stop you, as you are not oversimplifying things at all. However, I
> would respectfully disagree that it is an inappropriate thing to do with the
> rules engine.
>
> Fundamentally, the job of a rules engine is to construct, transform and
> enrich a graph. I would like to load and reason in a single step without
> involving lots of libraries and integrations. I personally believe that the
> world of Linked Data needs dramatic simplification, as the bar to entry is
> too high.
>
> The system I used at my previous employer basically did this with a rules
> framework that was more closely related to SPARQL than to Jena, which I
> agree has the most awful syntax. However, it would seem pragmatic to do the
> same thing.
>
> What I actually want to do is write a 'script' in Jena rules which can
> interrogate a Splunk search head, extract knowledge and instantiate
> triples. I could do some ****ing cool things with that.
>
> I'll look at using SPARQL as you suggest, but if I'm calling the endpoint
> tens of thousands of times, once for each set of bindings to create my
> models, I assume that it's going to suffer dreadful performance.
>
> Cheers,
>
> Richard
>
>
>
>
> On Mon, Mar 17, 2014 at 2:12 PM, Dave Reynolds <[email protected]>
> wrote:
>
>> On 17/03/14 12:26, [email protected] wrote:
>>
>>> Hi Dave,
>>>
>>>
>>> That is an enormous shame. This is a methodology I've worked with in a
>>> different library and it makes a very simple way to instantiate complex
>>> structures from tabular structures.
>>>
>>>
>>> For instance, consider the pseudocode below. It reads a three-column CSV
>>> file, creates URIs for the objects, and then combines these URIs and
>>> attributes in the forward-chaining part. This both loads and denormalizes
>>> the serialized object structure back into a graph / tree.
>>>
>>>
>>> [loadable:
>>>     Load("/mytable.csv", ?customerName, ?customerAccount, ?service)
>>>     makeURI(?custObj, ns:, ?customerAccount)
>>>     makeURI(?serviceObj, ns:, ?service)
>>>     ->
>>>     (?custObj a ns:CustomerAccount)
>>>     (?custObj ns:customerName ?customerName)
>>>     (?custObj ns:customerAccountId ?customerAccount)
>>>     (?custObj ns:customerHasService ?serviceObj)
>>>     (?serviceObj a ns:Service)
>>>     (?serviceObj ns:serviceId ?service)
>>> ]
>>>
>>>
>>> I am sure you can see that this method can be applied to lots of
>>> different scenarios and is a very simple way to load bindings and
>>> immediately create a graph from those bindings.
>>>
>>>
>>> Do you know of other libraries that might support such operations?
>>>
>>
>> Seems like all you need to do here is instantiate some triple patterns
>> based on bindings from a data source.
>>
>> That's pretty straightforward to do directly. I'm not sure the rules
>> engine would really help you much: you wouldn't be using rule chaining in
>> any case, and the syntax for the rules is hardly friendly :)
>>
>> I'd be inclined to just use a template instantiation approach. The
>> templates could be e.g. SPARQL update or some custom syntax.
>>
>> Typically the only tricky bit of template-based approaches is all the
>> processing builtins that you want for string mangling etc. In the past I've
>> typically used embedded Ruby or JavaScript for that, though recently [1] I
>> have used the lighter-weight Apache JEXL [2] with some success.
>>
>> Dave
>>
>> [1] https://github.com/epimorphics/dclib/wiki
>> [2] http://commons.apache.org/proper/commons-jexl/
>>
>>
>>
>>
>>>
>>> Cheers,
>>>
>>>
>>> Richard
>>>
>>> From: Dave Reynolds
>>> Sent: Monday, March 17, 2014 12:13 PM
>>>
>>> To: [email protected]
>>>
>>>
>>>
>>>
>>>
>>> On 17/03/14 11:44, Richard Morgan wrote:
>>>
>>>> Hi Dave,
>>>>
>>>> Thank you for your response, I'm glad to have my thoughts confirmed. Is it
>>>> possible to write my own generators and register them like I have with
>>>> builtins?
>>>>
>>>
>>> No, sorry.
>>>
>>> For the forward rule system there's simply no equivalent notion.
>>>
>>> For the backward rules there is the notion of generators but they aren't
>>> designed as an extension point (far from it).
>>>
>>>> The problem I want to solve isn't the regex example above; it's more about
>>>> generating bindings so I can feed them into a forward rule and then
>>>> instantiate triples as a general pattern.
>>>>
>>>
>>> Hard.
>>>
>>> You can write builtins which assert information directly into the
>>> deductions graph which can generate as many triples as you want. That's
>>> relatively easy and safe. However, it bypasses all the rule machinery
>>> and means that other rules don't see the results and you don't get to
>>> instantiate more patterns.
>>>
>>> It might just be possible to write a builtin which would directly call
>>> the rule engine to add a rule firing to the conflict set
>>> (RETEEngine.requestRuleFiring) and pass in a series of different
>>> manufactured binding environments to each firing request.
>>>
>>> However, I've never tried anything like that and prodding the underlying
>>> engine mechanics from within a builtin is not guaranteed to be safe!
>>>
>>> Dave
>>>
>>>
>>>> Cheers,
>>>>
>>>> Richard
>>>>
>>>>
>>>> On Mon, Mar 17, 2014 at 9:06 AM, Dave Reynolds <[email protected]> wrote:
>>>>
>>>>> On 14/03/14 13:53, Richard Morgan wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I would like to extend the base regex function in Jena to provide more
>>>>>> than one match result.
>>>>>>
>>>>>>
>>>>> I don't think that's possible.
>>>>>
>>>>>
>>>>>> For instance I would like the following rule
>>>>>>
>>>>>>     [ myregex("the cat sat on the mat", "(.at)", ?token)
>>>>>>         -> (<http://a> <http://b> ?token) ]
>>>>>>
>>>>>> to return
>>>>>>
>>>>>>     - [http://a, http://b, "cat"]
>>>>>>     - [http://a, http://b, "sat"]
>>>>>>     - [http://a, http://b, "mat"]
>>>>>>
>>>>>> From looking at how BindingEnvironment works I can only return a
>>>>>> single binding per variable.
>>>>>>
>>>>>>
>>>>> Correct.
>>>>>
>>>>> In Jena rules, builtins are essentially used only as filters on rule
>>>>> firings; they aren't generators.
>>>>>
>>>>> In the forward rule case (which is suggested by your notation above) that
>>>>> wouldn't make sense anyway: forward rules either fire or they don't, and
>>>>> there's no backtracking.
>>>>>
>>>>> In the backward rule case there is backtracking, but the interface for
>>>>> builtins doesn't support their use as generators.
>>>>>
>>>>> Dave
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
