On 15 July 2014 08:14, Jorge Gracia <[email protected]> wrote:
> Dear Jim,
>
> Thanks a lot for the initial feedback! Regarding your comments:
>
>> - the licence (GPL v2 or later) has been omitted. At the very least, the
>> data dump should include COPYING, but dcterms:license should also be
>> included
>
>
> The license can be found in the metadata file [1] and it is stated as GPL3

That is not correct. Actually, as I double check, there is nothing in
apertium-en-es or apertium-en-ca to say 'or later', so there has never
been a version that could have been relicensed as GPL3 by a third
party. But as you go on to say, your work is based on the LMF
conversion, so it's their ability to read a licence that's at fault,
not yours :)

In other conversions, the LMF converters omitted copyright holders,
added others from nowhere, misspelled names... those are just the
problems I remember, I'm sure there were others.

> in the following way (here in RDF/XML format for the EN-ES case, where dct
> is the prefix for 'dublin core terms' [2] and dcat for the 'data catalog
> vocabulary' [3]):
>
> <dcat:Dataset rdf:about="http://linguistic.linkeddata.es/set/apertium/EN-ES">
>  <dct:license rdf:resource="http://purl.oclc.org/NET/rdflicense/gpl-3.0"></dct:license>
>  <dct:source rdf:resource="http://hdl.handle.net/10230/17110"></dct:source>
> </dcat:Dataset>
>
>

I was referring to the dump - none of this is present in the zip file
linked to from the datahub. As this is the only form where a
derivative work is unquestionably distributed, it is the only form
where including the GPL is unquestionably required.
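
For what it's worth, checking a dump for a licence file is easy to
automate. This is only a rough sketch: the set of accepted file names is
my assumption, and the demo builds a throwaway in-memory zip rather than
touching the real dump, whose layout I'm not describing here.

```python
import io
import zipfile

# Assumed set of acceptable licence file names (not taken from the dump).
LICENCE_NAMES = {"COPYING", "COPYING.txt", "LICENSE", "LICENSE.txt"}


def licence_files(zip_source):
    """Return the archive members whose base name looks like a licence file."""
    with zipfile.ZipFile(zip_source) as archive:
        return [name for name in archive.namelist()
                if name.rsplit("/", 1)[-1] in LICENCE_NAMES]


def make_zip(names):
    """Build an in-memory zip with placeholder members (demo helper only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as archive:
        for name in names:
            archive.writestr(name, "placeholder")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    # Hypothetical member names, used purely for illustration.
    print(licence_files(make_zip(["dump/en-es.rdf"])))
    print(licence_files(make_zip(["dump/COPYING", "dump/en-es.rdf"])))
```

An empty list from licence_files() means the archive ships without any
of the assumed licence file names.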

>> - lack of provenance -- _which_ version of apertium-en-es was this derived
>> from? (including a link to an SVN version should be adequate; if it was
>> based on a released version, tell me which and I'll find the URI)
>>
>
> The provenance is also represented in the metadata (see code above) and
> points to the LMF version of Apertium [4]. Further, the RDF describing the
> lexicons and the translation sets also contains the source(s) from which they
> come. See http://linguistic.linkeddata.es/page/id/apertium/tranSetEN-ES
> for instance.
>

Again, not present in the dump. Unless the dataset is insanely large,
I generally prefer downloading it to throttling someone else's
server :)

>
>> Aside from that, the conversion looks quite lossy (missing senses, missing
>> alternate forms, etc.).
>
>
> Well, the conversion comes from the LMF/XML version and all that is in there
> is also now in RDF  (lexical entries, sense axis, POS, ...) without loss of
> information. The only filtering we did is with regard to the part of speech:
> we converted entries of type adj, adv, n, np, and vblex
>

I'd argue that converting 'np' to lexinfo:noun (instead of
lexinfo:properNoun) _is_ a loss of information[1], but now I know it's
based on the LMF conversion, I know where the loss happened (and,
also, I like to nitpick).
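
To illustrate the kind of mapping I mean, here is a minimal sketch in
Python. The five tags are the ones listed above; lexinfo:noun and
lexinfo:properNoun are the terms under discussion, while the other three
targets are my assumption about the obvious LexInfo equivalents. It is
not how the LMF or RDF conversion actually works.

```python
# Apertium POS tag -> LexInfo part of speech (as prefixed names).
POS_TO_LEXINFO = {
    "adj": "lexinfo:adjective",   # assumed target
    "adv": "lexinfo:adverb",      # assumed target
    "n": "lexinfo:noun",
    "np": "lexinfo:properNoun",   # kept distinct from lexinfo:noun
    "vblex": "lexinfo:verb",      # assumed target
}


def lexinfo_pos(tag):
    """Map an Apertium tag to a LexInfo term, failing loudly if unmapped."""
    try:
        return POS_TO_LEXINFO[tag]
    except KeyError:
        raise ValueError(f"unmapped Apertium tag: {tag!r}") from None


if __name__ == "__main__":
    for tag in ("n", "np"):
        print(tag, lexinfo_pos(tag))
```

Raising on an unknown tag, rather than falling back to a generic noun,
is the point: a lossy default is exactly what hides this kind of problem.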

In fact, I'd have defined a bunch of Apertium-specific items, like:

apertium:anthroponym
    a lexinfo:ProperNoun, lexinfo:NounPOS, lexinfo:PartOfSpeech,
        owl:NamedIndividual ;
    rdfs:label "anthroponym"@en .

but, IIRC, lexinfo:ProperNoun had too narrow a set of restrictions for
that to work. That said, I don't know OWL particularly well, so
there's probably something obvious I'm missing.

>>
>> If you can tell me which version it came from, I can re-add most of that.
>> I also have decompositions for some of the multiwords, and I think I have
>> some other data to interlink to other lexical datasets (things I did for the
>> first MLODE, but never finished), and maybe between the en-es and en-ca
>> English lexica (they are potentially identical, but I would need to know the
>> version they were derived from).
>>
>
> Anything to make Apertium RDF richer would be welcome!

Well, not using the LMF conversion would be a good start :) Aside from

> Actually I plan to go
> to MLODE'14 [5] with the idea of making some practical work around Apertium
> and linked data. Are you going to attend this time? It would be great to
> meet there!

I doubt it. Usually I have the problem of having money but not time,
or vice versa, but this time I have neither time nor money, so it
seems extremely unlikely :)

-- 
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff