On 15 July 2014 08:14, Jorge Gracia <[email protected]> wrote:
> Dear Jim,
>
> Thanks a lot for the initial feedback! Regarding your comments:
>
>> - the licence (GPL v2 or later) has been omitted. At the very least, the
>> data dump should include COPYING, but dcterms:license should also be
>> included
>
>
> The license can be found in the metadata file [1] and it is stated as GPL3

That is not correct. Actually, as I double check, there is nothing in
apertium-en-es or apertium-en-ca to say 'or later', so there has never been
a version that could have been relicensed as GPL3 by a third party. But as
you go on to say, your work is based on the LMF conversion, so it's their
ability to read a licence that's at fault, not yours :)

In other conversions, the LMF converters omitted copyright holders, added
others from nowhere, misspelled names... those are just the problems I
remember, I'm sure there were others.

> in the following way (here in RDF/XML format for the EN-ES case, where dct
> is the prefix for 'dublin core terms' [2] and dcat for the 'data catalog
> vocabulary' [3]):
>
> <dcat:Dataset
> rdf:about="http://linguistic.linkeddata.es/set/apertium/EN-ES">
> <dct:license
> rdf:resource="http://purl.oclc.org/NET/rdflicense/gpl-3.0"></dct:license>
> <dct:source rdf:resource="http://hdl.handle.net/10230/17110"></dct:source>
> </dcat:Dataset>
>
>

I was referring to the dump - none of this is present in the zip file linked
to from the datahub. As this is the only form where a derivative work is
unquestionably distributed, it is the only form where including the GPL is
unquestionably required.

>> - lack of provenance -- _which_ version of apertium-en-es was this derived
>> from? (including a link to an SVN version should be adequate; if it was
>> based on a released version, tell me which and I'll find the URI)
>>
>
> The provenance is also represented in the metadata (see code above) and
> points to the LMF version of Apertium [4]. Further, the RDF describing the
> lexicons and the translation sets also contains the source(s) from which
> they come. See http://linguistic.linkeddata.es/page/id/apertium/tranSetEN-ES
> for instance.
>

Again, not present in the dump.
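
Checking a dump for this is easy to script. A minimal sketch, assuming a
local copy of the zip: the function name, the set of licence file names and
the marker strings are my own illustrative choices, not anything the dump
or the datahub defines. It only looks for a licence file and for a
dct:license statement somewhere in the archive, and says nothing about
whether the stated licence is the right one.

```python
import zipfile

# Assumed file names; GNU projects conventionally ship the GPL text as COPYING.
LICENCE_FILE_NAMES = {
    "COPYING", "COPYING.TXT", "LICENSE", "LICENSE.TXT", "LICENCE", "LICENCE.TXT",
}

# Prefixed forms, plus the full Dublin Core Terms property IRI.
LICENCE_MARKERS = (
    b"dct:license",
    b"dcterms:license",
    b"http://purl.org/dc/terms/license",
)


def check_dump(zip_source):
    """Return (licence_files, members_with_licence_metadata) for a zip dump.

    zip_source may be a file path or a binary file-like object.
    """
    licence_files = []
    metadata_members = []
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            base_name = info.filename.rsplit("/", 1)[-1]
            if base_name.upper() in LICENCE_FILE_NAMES:
                licence_files.append(info.filename)
                continue
            # Reads the whole member into memory; acceptable for a sketch,
            # but a very large dump would want a streaming search instead.
            data = archive.read(info)
            if any(marker in data for marker in LICENCE_MARKERS):
                metadata_members.append(info.filename)
    return licence_files, metadata_members
```

Calling check_dump() on the downloaded zip and getting two empty lists back
would correspond to what I described: no COPYING and no licence metadata in
the distributed form.
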
Unless the dataset is insanely large, I generally prefer to download it
rather than throttle someone else's server :)

>
>> Aside from that, the conversion looks quite lossy (missing senses, missing
>> alternate forms, etc.).
>
>
> Well, the conversion comes from the LMF/XML version and all that is in there
> is also now in RDF (lexical entries, sense axis, POS, ...) without loss of
> information. The only filtering we did is with regard to the part of speech:
> we converted entries of type adj, adv, n, np, and vblex
>

I'd argue that converting 'np' to lexinfo:noun (instead of
lexinfo:properNoun) _is_ a loss of information[1], but now I know it's based
on the LMF conversion, I know where the loss happened (and, also, I like to
nitpick).

In fact, I'd have defined a bunch of Apertium-specific items, like:

  apertium:anthroponym a lexinfo:ProperNoun, lexinfo:NounPOS,
      lexinfo:PartOfSpeech, owl:NamedIndividual ;
    rdfs:label "anthroponym"@en .

but, IIRC, lexinfo:ProperNoun had too narrow a set of restrictions for that
to work. That said, I don't know OWL particularly well, so there's probably
something obvious I'm missing.

>>
>> If you can tell me which version it came from, I can re-add most of that.
>> I also have decompositions for some of the multiwords, and I think I have
>> some other data to interlink to other lexical datasets (things I did for the
>> first MLODE, but never finished), and maybe between the en-es and en-ca
>> English lexica (they are potentially identical, but I would need to know the
>> version they were derived from).
>>
>
> Anything to make Apertium RDF richer would be welcome!

Well, not using the LMF conversion would be a good start :) Aside from

> Actually I plan to go
> to MLODE'14 [5] with the idea of making some practical work around Apertium
> and linked data. Are you going to attend this time? It would be great to
> meet there!

I doubt it.
Usually I have the problem of having money but not time, or vice versa, but
this time I have neither time nor money, so it seems extremely unlikely :)

--
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
