Hi,
Maybe this is a silly question, or you have already discussed it in the
past (I didn't find anything when searching).
Wouldn't it be more accurate if, instead of creating a regex that finds
"non-free" in the text, we created a regex that catches the free images
via the (free) copyright codes, e.g. {{PD-Art}}, {{PD-US}}?
This way "false positives" (non-free images passed as free) would be
eliminated.
It could also be easier to extend to more languages, by creating one regex
per language (swapping in the language-specific copyright codes).
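
To make the idea concrete, here is a rough Python sketch of a per-language
"free" regex. It is only an illustration, not how the extractor currently
works: the names FREE_TEMPLATES, build_free_regex and is_free are made up,
and apart from PD-Art and PD-US the template names are placeholders.

```python
import re

# Hypothetical per-language lists of free copyright codes (template names).
# Only PD-Art and PD-US come from this message; the "de" entry is a placeholder.
FREE_TEMPLATES = {
    "en": ["PD-Art", "PD-US"],
    "de": ["Beispiel-PD"],  # placeholder, not a verified template name
}


def build_free_regex(names):
    """Build a regex matching {{Name}} or {{Name|params}} for the given names."""
    alternatives = "|".join(re.escape(name) for name in names)
    return re.compile(r"\{\{\s*(?:" + alternatives + r")\s*(?:\|[^{}]*)?\}\}")


# One compiled regex per language.
FREE_REGEX = {lang: build_free_regex(names) for lang, names in FREE_TEMPLATES.items()}


def is_free(wikitext, lang="en"):
    """Return True only if a known free template appears in the text.

    Unknown languages and unknown templates are treated as not free.
    """
    regex = FREE_REGEX.get(lang)
    return bool(regex and regex.search(wikitext))
```

With this approach anything that is not explicitly listed is treated as
non-free, so a missing entry costs a free image rather than letting a
non-free one through.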
Another way would be to create a mapping of all copyright codes and tag each
one as free/non-free.
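
A minimal sketch of the mapping variant, again in Python and under my own
assumptions: LICENSE_STATUS and classify are hypothetical names, and
"Non-free logo" is only an illustrative non-free entry. A page with any
non-free code is reported as non-free, and pages with no known code are
reported as unknown.

```python
import re

# Hypothetical mapping of copyright codes to a free/non-free tag.
# Only PD-Art and PD-US come from this message; the rest is illustrative.
LICENSE_STATUS = {
    "PD-Art": "free",
    "PD-US": "free",
    "Non-free logo": "non-free",  # assumed example of a non-free code
}

# Captures the template name from {{Name}} or {{Name|params}}.
TEMPLATE_NAME = re.compile(r"\{\{\s*([^|{}]+?)\s*(?:\|[^{}]*)?\}\}")


def classify(wikitext):
    """Return "free", "non-free" or "unknown" for an image description page."""
    statuses = {
        LICENSE_STATUS[name]
        for name in TEMPLATE_NAME.findall(wikitext)
        if name in LICENSE_STATUS
    }
    if "non-free" in statuses:
        return "non-free"  # any non-free code wins
    if "free" in statuses:
        return "free"
    return "unknown"
```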
Thanks,
Jim