Lukáš, of course you are right: RDF/XML looks complex and requires parsing. The underlying principle of all RDF is a graph (or a series of subject/predicate/object triples, where the triple series is a serialization of the graph). So the challenge is, first, parsing the RDF input; second, constructing the model; and third, serializing the model to an ES-friendly input (here: JSON-LD, sort of). RDF ensures that there is a single model for all serializations.
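A minimal sketch of the three steps above, in Python and under stated assumptions: the input is already in a simple N-Triples form (IRIs and plain literals only, one triple per line), and the names parse_triples, build_model and to_es_docs are hypothetical, not taken from any RDF library. A real pipeline would use a proper RDF parser. The sketch only illustrates that once the triples are grouped into a model by subject, the serialization to per-subject JSON documents no longer depends on the input syntax.

```python
import json
import re
from collections import defaultdict

# Hypothetical sample input: simple N-Triples lines (IRIs and plain literals only).
NTRIPLES = '''
<http://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .
<http://example.org/alice> <http://xmlns.com/foaf/0.1/knows> <http://example.org/bob> .
<http://example.org/bob> <http://xmlns.com/foaf/0.1/name> "Bob" .
'''

# Subject IRI, predicate IRI, then either an object IRI or a plain literal.
TRIPLE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+(?:<([^>]*)>|"([^"]*)")\s*\.$')

def parse_triples(text):
    """Step 1: parse. Yields (subject, predicate, object) tuples."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        m = TRIPLE.match(line)
        if m is None:
            raise ValueError(f"unsupported triple syntax: {line!r}")
        s, p, iri, lit = m.groups()
        # IRI objects are kept as {"@id": ...}; literals become plain strings.
        obj = {"@id": iri} if iri is not None else lit
        yield s, p, obj

def build_model(triples):
    """Step 2: model. Maps subject -> predicate -> list of objects."""
    model = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        model[s][p].append(o)
    return model

def to_es_docs(model):
    """Step 3: serialize. One JSON-LD-like dict per subject."""
    docs = []
    for s, props in model.items():
        doc = {"@id": s}
        for p, values in props.items():
            # Single values stay scalar; repeated predicates become arrays.
            doc[p] = values[0] if len(values) == 1 else values
        docs.append(doc)
    return docs

docs = to_es_docs(build_model(parse_triples(NTRIPLES)))
for d in docs:
    print(json.dumps(d))
```

Swapping the parser for one that reads RDF/XML or JSON-LD would leave build_model and to_es_docs unchanged, which is the point of having a single model behind all serializations.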
This technical perspective does not necessarily solve all the challenges that are inherent to the chosen data model, for example nested resources in RDF. It might be feasible to flatten nested resources by their identifiers and generate one JSON document after the other. Or it could be feasible to keep nested resources intact and wrap them into nested structures in a single ES JSON object.

In my data model, I can map RDF subject IDs to ES doc IDs. Other data models may prefer other approaches to selecting ES doc IDs.

Jörg

On Fri, Sep 26, 2014 at 10:11 AM, Lukáš Vlček <[email protected]> wrote:

> Jörg,
>
> my concern is that RDF/XML allows one thing to be expressed in several ways. For example, if you take the FOAF specification, there are several ways to express that one Person knows another Person. One way is using reference IDs; the other way is using a nested Person inside another Person. See [1] for examples. My understanding is that although both ways express exactly the same information, they lead to different XML representations and thus to different JSON-LD. Not that you cannot push such data into ES, but I wonder if you can then have any consistent way of querying such data.
>
> Maybe there is some way to preprocess the XML document and convert all nested Persons to references (which would require arbitrary ID construction?). Or something similar. Though I am not sure this would be a generally applicable approach to any RDF data.
>
> [1] http://www.xml.com/pub/a/2004/02/04/foaf.html
>
> Regards,
> Lukas
>
> On Fri, Sep 26, 2014 at 9:28 AM, [email protected] <[email protected]> wrote:
>
>> JSON-LD is perfect for ES indexing, as long as you use the "compact" form of representation.
>>
>> http://www.w3.org/TR/json-ld-api/#compaction-algorithms
>>
>> Example:
>>
>> https://github.com/lanthaler/JsonLD/blob/master/Test/Fixtures/sample-compacted.jsonld
>>
>> This means you should use short field names and shorten IRIs to a prefix form.
>> This gives a convenient mapping to ES field names (e.g. "dc:title" or "dc:creator"). The '@' fields can also be indexed, and they do not control anything special in ES (some @id may be mapped to ES _id, but for nested structures this does not match).
>>
>> I use my own RDF API and transform RDF graphs (so not only JSON-LD but also other formats like N-Triples and RDF/XML) into XContent using this method:
>>
>> https://github.com/xbib/xbib/blob/master/content/src/main/java/org/xbib/rdf/content/DefaultResourceContentBuilder.java
>>
>> I plan to extend this content building by interpreting rdf:type and rdf:list etc. to generate correct ES JSON objects and arrays. There is also an amount of work left to do for the plethora of XSD types in RDF literals or for language tags.
>>
>> This will be subsumed into an RDF input/output plugin for an ES-based Linked Data Platform
>>
>> http://www.w3.org/TR/ldp/
>>
>> but there is no ETA yet.
>>
>> Jörg
>>
>> On Fri, Sep 26, 2014 at 5:08 AM, Lukáš Vlček <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I think you will have to preprocess the documents on your side first and then push them into ES individually (you can push in batch).
>>>
>>> As a side note, I would say JSON-LD is quite a low-level serialization of RDF data, IMO not optimal for ES indexing. It may be better to find some RDF-OOM tool, have your RDF documents mapped to Java POJOs, and serialize the POJOs into JSON instead (you can use the Jackson library for that, for example). This will give you better control over the whole RDF -> JSON conversion process.
>>>
>>> Regards,
>>> Lukas
>>>
>>> On Thu, Sep 25, 2014 at 7:21 PM, abo <[email protected]> wrote:
>>>
>>>> Hello,
>>>>
>>>> I'm new to Elasticsearch, so forgive me if this is a basic question or if it's in some documentation that I haven't read...
>>>>
>>>> I am trying to load a json-ld file into ES. The json-ld file was generated from an RDF file, using Jena.
>>>> The structure starts with:
>>>>
>>>> {
>>>> "@graph" :
>>>>
>>>> followed by the individual "documents", each with:
>>>>
>>>> {
>>>> "@id" :
>>>>
>>>> and a variable number of parameters in each.
>>>>
>>>> My question is how do I load this into ES and ensure that documents are individually referenced (as opposed to the entire json-ld file)?
>>>>
>>>> Do I need to doctor this json-ld file further in order to load it?
>>>>
>>>> Thanks for your help.
>>>>
>>>> -- abo
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google Groups "elasticsearch" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>>>> To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/ec26bbe7-5bb1-4c50-96c4-8f586e1e0807%40googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAKdsXoGWEaadvoAJWmwDeKqb9pVsYNjS6GAzozVXgYWr4LgXUg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
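For the original question in this thread (a JSON-LD file with a top-level "@graph" array whose entries each carry an "@id"), here is a hedged sketch of the preprocessing step in Python. It splits the graph into one document per node and emits newline-delimited JSON in the shape the Elasticsearch bulk API accepts, using each node's "@id" as the document _id. The index name "rdf", the sample data and the function name graph_to_bulk are assumptions for illustration only; older Elasticsearch versions also expect a "_type" in the action line.

```python
import json

# Hypothetical sample shaped like the file described in the thread.
JSONLD = {
    "@context": {"foaf": "http://xmlns.com/foaf/0.1/"},
    "@graph": [
        {"@id": "http://example.org/alice", "foaf:name": "Alice"},
        {"@id": "http://example.org/bob", "foaf:name": "Bob"},
    ],
}

def graph_to_bulk(jsonld, index="rdf"):
    """Return a bulk request body: one action line plus one source line per node."""
    lines = []
    for node in jsonld.get("@graph", []):
        node_id = node.get("@id")
        if node_id is None:
            continue  # nodes without an identifier are skipped in this sketch
        # Action line: tells ES which index and document ID to use.
        lines.append(json.dumps({"index": {"_index": index, "_id": node_id}}))
        # Source line: the node itself, indexed as its own document.
        lines.append(json.dumps(node))
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)

body = graph_to_bulk(JSONLD)
print(body, end="")
```

The resulting body could then be sent to the _bulk endpoint. Nested resources are not handled here: a node that embeds another node is indexed as one nested object, which is one of the two options discussed earlier in the thread.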
