Absolutely. My thought is to manage one (or more) context ES JSON document(s) where all the @context definitions of an index live. A format plugin can then process search results and convert the ES JSON to expanded JSON-LD, and from there to other RDF serializations.
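As a rough illustration of that idea (a sketch only, not the plugin itself): the snippet below keeps one hypothetical context document per index, uses it to turn a flat search hit into an expanded JSON-LD node, and then writes that node as N-Triples. The names INDEX_CONTEXTS, hit_to_expanded and to_ntriples, the "library" index and the example.org IRIs are all assumptions made for the example. It only handles "prefix:local" keys with plain literal values, so it is a simplified stand-in for the real JSON-LD expansion algorithm.

```python
import json

# Hypothetical context document per index (assumption: prefix -> namespace IRI only).
INDEX_CONTEXTS = {
    "library": {
        "dc": "http://purl.org/dc/elements/1.1/",
        "foaf": "http://xmlns.com/foaf/0.1/",
    }
}


def expand_key(key, context):
    """Expand a 'prefix:local' key to a full IRI if the prefix is known."""
    prefix, sep, local = key.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return key  # unknown keys pass through unchanged


def hit_to_expanded(hit, contexts):
    """Convert one search hit into an expanded JSON-LD node.

    Assumes the ES _id is the subject IRI and that all values are
    plain literals (no nested resources, datatypes or language tags).
    """
    context = contexts[hit["_index"]]
    node = {"@id": hit["_id"]}
    for key, value in hit["_source"].items():
        if key.startswith("@"):
            continue  # keywords are not expanded in this sketch
        values = value if isinstance(value, list) else [value]
        node[expand_key(key, context)] = [{"@value": v} for v in values]
    return node


def to_ntriples(node):
    """Serialize an expanded node as N-Triples lines (all literals as plain strings)."""
    lines = []
    for predicate, values in node.items():
        if predicate.startswith("@"):
            continue
        for v in values:
            literal = json.dumps(str(v["@value"]))  # JSON escaping used as an approximation
            lines.append(f'<{node["@id"]}> <{predicate}> {literal} .')
    return lines


hit = {
    "_index": "library",
    "_id": "http://example.org/book/1",
    "_source": {"dc:title": "Moby Dick", "dc:creator": ["Herman Melville"]},
}
node = hit_to_expanded(hit, INDEX_CONTEXTS)
triples = to_ntriples(node)
```

A real plugin would more likely delegate expansion to a full JSON-LD processor; the point here is only that the context can live outside the indexed documents.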
Jörg

On Fri, Sep 26, 2014 at 6:23 PM, Alfredo Serafini <[email protected]> wrote:

> Hi
>
> using json-ld is indeed rather simple, as it is JSON, so it's even
> possible to index it as is.
> I'm currently using ES for storing RDF documents in json-ld in a specific
> index: in that case one can simply use the URI as the _id, recover the full
> original format from _source, and use basic search capabilities on the index,
> if escaping / nesting is not a big deal.
>
> However, in order to use resources with some more flexibility, I think the
> best would be to index them as "flat" as possible, then apply an ad-hoc @context
> to the ES JSON to obtain the original json-ld again.
> This would be my ideal usage at the moment: it seems complex at first, but
> it's not. I'm currently experimenting with saving a @context for a _type,
> obtaining, let's say, a sort of _context, similar to a _mapping, to
> reconstruct the original semantics.
> If someone likes the idea, I'd like to share thoughts on that :-)
>
>
> On Friday, September 26, 2014 14:08:07 UTC+2, Jörg Prante wrote:
>>
>> Lukáš,
>>
>> of course you are right, RDF/XML looks complex and requires parsing. The
>> underlying principle of all RDF is a graph (or a series of triples in the form
>> of subject/predicate/object, where the triple series is a serialization of
>> the graph). So the challenge is first parsing the RDF input, second
>> constructing the model, and third serializing the model to an ES-friendly
>> input (here: JSON-LD, sort of). RDF ensures that there is a single model
>> for all serializations.
>>
>> This technical perspective does not necessarily solve all challenges that
>> are inherent to the chosen data model, for example nested resources in
>> RDF. It might be feasible to flatten nested resources by their identifiers
>> and generate one JSON document after the other. Or it could be feasible to keep
>> nested resources intact and wrap them into nested structures in a single ES
>> JSON object.
>>
>> In my data model, I can map RDF subject IDs to ES doc IDs. Other data
>> models may prefer other approaches to select ES doc IDs.
>>
>> Jörg
>>
>>
>>
>> On Fri, Sep 26, 2014 at 10:11 AM, Lukáš Vlček <[email protected]> wrote:
>>
>>> Jörg,
>>>
>>> my concern is that RDF/XML allows one thing to be expressed in several ways.
>>> For example, if you take the FOAF specification, there are several ways
>>> you can express that one Person knows another Person. One way is using
>>> reference IDs; another way is nesting a Person inside another Person. See [1]
>>> for examples. My understanding is that although both ways express exactly
>>> the same information, they lead to different XML representations and thus to
>>> different JSON-LD. Note that you can push such data into ES, but I wonder if
>>> you can then have any consistent way of querying such data.
>>>
>>> Maybe there is some way to preprocess the XML document and convert
>>> all nested Persons to references (would that require arbitrary ID
>>> construction?). Or something similar. Though I am not sure this would be
>>> a generally applicable approach for any RDF data.
>>>
>>> [1] http://www.xml.com/pub/a/2004/02/04/foaf.html
>>>
>>> Regards,
>>> Lukas
>>>
>>> On Fri, Sep 26, 2014 at 9:28 AM, [email protected] <[email protected]>
>>> wrote:
>>>
>>>> JSON-LD is perfect for ES indexing, as long as you use the "compact"
>>>> form of representation.
>>>>
>>>> http://www.w3.org/TR/json-ld-api/#compaction-algorithms
>>>>
>>>> Example:
>>>>
>>>> https://github.com/lanthaler/JsonLD/blob/master/Test/Fixtures/sample-compacted.jsonld
>>>>
>>>> This means you should use short field names and shorten IRIs to a
>>>> prefix form. This gives a convenient mapping to ES field names (e.g.
>>>> "dc:title" or "dc:creator").
>>>> The '@' fields can also be indexed and they do
>>>> not control anything special in ES (some @id may be mapped to ES _id but
>>>> for nested structures this does not match).
>>>>
>>>> I use my own RDF API and transform RDF graphs (so not only JSON-LD but
>>>> also other formats like N-Triples and RDF/XML) into XContent using this
>>>> method:
>>>>
>>>> https://github.com/xbib/xbib/blob/master/content/src/main/java/org/xbib/rdf/content/DefaultResourceContentBuilder.java
>>>>
>>>> I plan to extend this content building by interpreting rdf:type and
>>>> rdf:list etc. to generate correct ES JSON objects and arrays. There is also
>>>> an amount of work left to do for the plethora of XSD types in RDF literals
>>>> or for language tags.
>>>>
>>>> This will be subsumed into an RDF input/output plugin for an ES-based
>>>> Linked Data Platform
>>>>
>>>> http://www.w3.org/TR/ldp/
>>>>
>>>> but there is no ETA yet.
>>>>
>>>> Jörg
>>>>
>>>>
>>>> On Fri, Sep 26, 2014 at 5:08 AM, Lukáš Vlček <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I think you will have to preprocess the documents on your side first and
>>>>> then push them into ES individually (you can push in batches).
>>>>>
>>>>> As a side note, I would say json-ld is quite a low-level serialization
>>>>> of RDF data, IMO not optimal for ES indexing. It may be better to
>>>>> find some RDF-OOM tool, have your RDF documents mapped to Java POJOs, and
>>>>> serialize the POJOs into JSON instead (you can use the Jackson library for
>>>>> that, for example). This will give you better control over the whole
>>>>> RDF -> JSON conversion process.
>>>>>
>>>>> Regards,
>>>>> Lukas
>>>>>
>>>>> On Thu, Sep 25, 2014 at 7:21 PM, abo <[email protected]> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I'm new to Elasticsearch, so forgive me if this is a basic question
>>>>>> or if it's in some documentation that I haven't read...
>>>>>>
>>>>>> I am trying to load a json-ld file into ES.
>>>>>> The json-ld file was
>>>>>> generated from an RDF file, using Jena. The structure starts with:
>>>>>>
>>>>>> {
>>>>>> "@graph" :
>>>>>>
>>>>>> followed by the individual "documents", each with:
>>>>>>
>>>>>> {
>>>>>> "@id" :
>>>>>>
>>>>>> and a variable number of parameters in each.
>>>>>>
>>>>>> My question is how do I load this into ES and ensure that documents
>>>>>> are individually referenced (as opposed to the entire json-ld file)?
>>>>>>
>>>>>> Do I need to doctor this json-ld file further in order to load it?
>>>>>>
>>>>>> Thanks for your help.
>>>>>>
>>>>>> -- abo
>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "elasticsearch" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>> To view this discussion on the web visit
>>>>>> https://groups.google.com/d/msgid/elasticsearch/ec26bbe7-5bb1-4c50-96c4-8f586e1e0807%40googlegroups.com
>>>>>> For more options, visit https://groups.google.com/d/optout.
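For the original question in this thread (loading a json-ld file with a top-level "@graph" so that each node becomes its own ES document), the preprocessing step could look roughly like the sketch below. It is only an illustration under some assumptions: the function name graph_to_bulk, the index name "rdf" and the sample data are made up, every node is assumed to carry an "@id", and the exact metadata fields accepted in the action line depend on the Elasticsearch version in use. The output is the newline-delimited body expected by the bulk API, with the "@id" of each node used as the document _id.

```python
import json


def graph_to_bulk(jsonld, index):
    """Build a bulk request body with one action/source pair per @graph node.

    Assumes every node has an "@id"; the top-level "@context" is not copied
    into the documents and would have to be kept elsewhere.
    """
    lines = []
    for node in jsonld.get("@graph", []):
        action = {"index": {"_index": index, "_id": node["@id"]}}
        lines.append(json.dumps(action))  # action line
        lines.append(json.dumps(node))    # source line, "@" fields included
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


# Made-up sample in the shape described in the question.
doc = {
    "@graph": [
        {"@id": "http://example.org/a", "dc:title": "First"},
        {"@id": "http://example.org/b", "dc:title": "Second"},
    ],
    "@context": {"dc": "http://purl.org/dc/elements/1.1/"},
}
body = graph_to_bulk(doc, "rdf")
```

The resulting string can be sent to the _bulk endpoint with any HTTP client. Nested resources inside a node are passed through unchanged here, so the flattening question discussed above is left open.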
