Re: Loading JSON-LD into ES

[email protected] Fri, 26 Sep 2014 11:33:08 -0700

Absolutely. My idea is to manage one (or more) ES JSON context document(s)
where all the @context definitions of an index live. A format plugin can
then process search results, convert the ES JSON to expanded JSON-LD, and
from there to other RDF serializations.


Jörg
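
The idea above (a stored context document used to turn ES search hits back
into expanded JSON-LD) can be sketched roughly as follows. This is only a
minimal illustration, not the plugin described in the message and not the full
JSON-LD expansion algorithm: the CONTEXT dictionary, the sample hit, and the
helper names expand_iri, expand_value, and expand_node are all assumptions
made up for this example, and only prefix expansion of keys and types is
handled.

```python
# Hypothetical context document for one index: prefix -> namespace IRI.
CONTEXT = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand_iri(term, context):
    """Expand a compact IRI such as 'dc:title'; leave other strings unchanged."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in context and not local.startswith("//"):
        return context[prefix] + local
    return term


def expand_value(value, context):
    """Nested objects become nodes; scalars become {'@value': ...} objects."""
    if isinstance(value, dict):
        return expand_node(value, context)
    return {"@value": value}


def expand_node(source, context):
    """Turn a compact ES _source into a simplified expanded-form node."""
    node = {}
    for key, value in source.items():
        if key == "@id":
            node["@id"] = expand_iri(value, context)
        elif key == "@type":
            types = value if isinstance(value, list) else [value]
            node["@type"] = [expand_iri(t, context) for t in types]
        else:
            values = value if isinstance(value, list) else [value]
            node[expand_iri(key, context)] = [
                expand_value(v, context) for v in values
            ]
    return node


# A made-up search hit whose _id carries the subject IRI.
hit = {
    "_id": "http://example.org/doc/1",
    "_source": {
        "@type": "foaf:Document",
        "dc:title": "Loading JSON-LD into ES",
        "dc:creator": ["abo", "Jörg"],
    },
}

expanded = expand_node({"@id": hit["_id"], **hit["_source"]}, CONTEXT)
```

Keeping the prefixes in one place per index means the stored documents can
keep short field names such as "dc:title", while the full IRIs are restored
only when a result is serialized.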

On Fri, Sep 26, 2014 at 6:23 PM, Alfredo Serafini <[email protected]> wrote:

> Hi
>
> Using JSON-LD is indeed rather simple: since it is JSON, it's even
> possible to index it as is.
> I'm currently using ES to store RDF documents as JSON-LD in a dedicated
> index. In that case one can simply use the URI as the _id, recover the full
> original document from _source, and use basic search capabilities on the
> index, if escaping / nesting is not a big deal.
>
> However, in order to use resources with more flexibility, I think the
> best approach would be to index them as "flat" as possible, then apply an
> ad-hoc @context to the ES JSON to obtain the original JSON-LD again.
> This would be my ideal usage at the moment. It seems complex at first, but
> it's not: I'm currently experimenting with saving an @context per _type,
> obtaining, let's say, a sort of _context, similar to a _mapping, to
> reconstruct the original semantics.
> If someone likes the idea, I'd like to share thoughts on that :-)
>
>
> On Friday, 26 September 2014 14:08:07 UTC+2, Jörg Prante wrote:
>>
>> Lukáš,
>>
>> of course you are right, RDF/XML looks complex and requires parsing. The
>> underlying principle of all RDF is a graph (or a series of triples in the
>> form subject/predicate/object, where the triple series is a serialization
>> of the graph). So the challenge is, first, parsing the RDF input; second,
>> constructing the model; and third, serializing the model to an ES-friendly
>> input (here: JSON-LD, sort of). RDF ensures that there is a single model
>> for all serializations.
>>
>> This technical perspective does not necessarily solve all the challenges
>> that are inherent to the chosen data model, for example nested resources
>> in RDF. It might be feasible to flatten nested resources by their
>> identifiers and generate one JSON document after the other. Or it could be
>> feasible to keep nested resources intact and wrap them into nested
>> structures in a single ES JSON object.
>>
>> In my data model, I can map RDF subject IDs to ES doc IDs. Other data
>> models may prefer other approaches to select ES doc IDs.
>>
>> Jörg
>>
>>
>>
>> On Fri, Sep 26, 2014 at 10:11 AM, Lukáš Vlček <[email protected]> wrote:
>>
>>> Jörg,
>>>
>>> my concern is that RDF/XML allows one thing to be expressed in several
>>> ways. For example, if you take the FOAF specification, there are several
>>> ways to express that one Person knows another Person. One way is to use
>>> reference IDs; another is to nest one Person inside the other. See [1]
>>> for examples. My understanding is that although both ways express exactly
>>> the same information, they lead to different XML representations and thus
>>> to different JSON-LD. Note that you can push such data into ES, but I
>>> wonder whether you can then query it in any consistent way.
>>>
>>> Maybe there is some way to preprocess the XML document and convert
>>> all nested Persons to references (would that require constructing
>>> arbitrary IDs?). Or something similar. Though I am not sure this would be
>>> a generally applicable approach for any RDF data.
>>>
>>> [1] http://www.xml.com/pub/a/2004/02/04/foaf.html
>>>
>>> Regards,
>>> Lukas
>>>
>>> On Fri, Sep 26, 2014 at 9:28 AM, [email protected] <[email protected]>
>>> wrote:
>>>
>>>> JSON-LD is perfect for ES indexing, as long as you use the "compact"
>>>> form of representation.
>>>>
>>>> http://www.w3.org/TR/json-ld-api/#compaction-algorithms
>>>>
>>>> Example:
>>>>
>>>> https://github.com/lanthaler/JsonLD/blob/master/Test/Fixtures/sample-compacted.jsonld
>>>>
>>>> This means you should use short field names and shorten IRIs to a
>>>> prefix form. This gives a convenient mapping to ES field names (e.g.
>>>> "dc:title" or "dc:creator"). The '@' fields can also be indexed; they
>>>> have no special meaning in ES (an @id may be mapped to the ES _id, but
>>>> this does not work for nested structures).
>>>>
>>>> I use my own RDF API and transform RDF graphs (so not only JSON-LD but
>>>> also other formats like N-Triples and RDF/XML) into XContent using this
>>>> method:
>>>>
>>>> https://github.com/xbib/xbib/blob/master/content/src/main/java/org/xbib/rdf/content/DefaultResourceContentBuilder.java
>>>>
>>>> I plan to extend this content building by interpreting rdf:type and
>>>> rdf:list etc. to generate correct ES JSON objects and arrays. There is also
>>>> an amount of work left to do for the plethora of XSD types in RDF literals
>>>> or for language tags.
>>>>
>>>> This will be subsumed into an RDF input/output plugin for an ES-based
>>>> Linked Data Platform
>>>>
>>>> http://www.w3.org/TR/ldp/
>>>>
>>>> but there is no ETA yet.
>>>>
>>>> Jörg
>>>>
>>>>
>>>> On Fri, Sep 26, 2014 at 5:08 AM, Lukáš Vlček <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I think you will have to preprocess the documents on your side first and
>>>>> then push them into ES individually (you can push them in batches).
>>>>>
>>>>> As a side note, I would say JSON-LD is quite a low-level serialization
>>>>> of RDF data and, IMO, not optimal for ES indexing. It may be better to
>>>>> find some RDF-OOM tool, have your RDF documents mapped to Java POJOs, and
>>>>> serialize the POJOs into JSON instead (you can use the Jackson library for
>>>>> that, for example). This will give you better control over the whole
>>>>> RDF -> JSON conversion process.
>>>>>
>>>>> Regards,
>>>>> Lukas
>>>>>
>>>>> On Thu, Sep 25, 2014 at 7:21 PM, abo <[email protected]> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I'm new to Elasticsearch, so forgive me if this is a basic question
>>>>>> or if it's in some documentation that I haven't read...
>>>>>>
>>>>>> I am trying to load a json-ld file into ES. The json-ld file was
>>>>>> generated from an RDF file, using Jena. The structure starts with:
>>>>>>
>>>>>> {
>>>>>>   "@graph" :
>>>>>>
>>>>>> followed by the individual "documents", each with:
>>>>>>
>>>>>> {
>>>>>>     "@id" :
>>>>>>
>>>>>> and a variable number of parameters in each.
>>>>>>
>>>>>> My question is how do I load this into ES and ensure that documents
>>>>>> are individually referenced (as opposed to the entire json-ld file)?
>>>>>>
>>>>>> Do I need to doctor this json-ld file further in order to load it?
>>>>>>
>>>>>> Thanks for your help.
>>>>>>
>>>>>> -- abo
>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "elasticsearch" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>> To view this discussion on the web visit
>>>>>> https://groups.google.com/d/msgid/elasticsearch/ec26bbe7-5bb1-4c50-96c4-8f586e1e0807%40googlegroups.com.
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>