Hi,

On Wed, Jul 7, 2010 at 9:45 PM, R. Tyler Ballance <[email protected]> wrote:
> I've slowly but surely been fixing the errors as I come across them in the
> file, but it's slow going (I can't figure out how to get DBpedia to ignore
> erroneous entries). :(

Which software are you using to process the DBpedia datasets? Could you
maybe tell this software not to check whether a URI is valid?

Cheers,
Max

>> On Sat, Jun 19, 2010 at 12:24 AM, R. Tyler Ballance <[email protected]>
>> wrote:
>> > I'm working with 3.5.1, and I've noticed that
>> > mappingbased_properties_en.nt, compared to the other sets that I've
>> > worked with, is *full* of errors that break my imports in funky ways.
>> >
>> > There are a number of non-absolute URLs:
>> >
>> > ERROR: Malformed document: Not a valid (absolute) URI:
>> > www.newfreedomboro.org/index2.htm [line 600646]
>> > ERROR: Malformed document: Not a valid (absolute) URI:
>> > www.rubenblades.com [line 975491]
>> > ERROR: Malformed document: Not a valid (absolute) URI: Fansite [line
>> > 1056096]
>> > ERROR: Malformed document: Not a valid (absolute) URI: None [line
>> > 278162]
>> >
>> > (Just a couple of examples)
>> >
>> > As a matter of practice, I've just been dropping malformed entities from
>> > the file, but I'm wondering if there's anything I can do to track down
>> > the errors to help improve the next release?
>> >
>> > Would filing a ticket with a unified diff of the 3.5.1
>> > mappingbased_properties_en.nt file compared to my modified one be helpful?
>> >
>> > Cheers,
>> > -R. Tyler Ballance
>> > --------------------------------------
>> > Jabber: [email protected]
>> > GitHub: http://github.com/rtyler
>> > Identica: http://identi.ca/dero
>> > Twitter: http://twitter.com/agentdero
>> > Blog: http://unethicalblogger.com
>> >
>> > _______________________________________________
>> > Dbpedia-discussion mailing list
>> > [email protected]
>> > https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
>
> Cheers,
> -R. Tyler Ballance
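
The workaround mentioned in the thread (dropping triples whose URIs are not absolute before importing) could be sketched roughly as follows. This is only an illustration, not the tool either poster used: the function names, the regular expressions, and the sample triples in the usage note are all assumptions, and the scheme check is a heuristic (a bare `host:port` value with no scheme would still pass it).

```python
import re

# Heuristic: an absolute URI starts with a scheme such as "http:" (RFC 3986).
SCHEME = re.compile(r'^[A-Za-z][A-Za-z0-9+.-]*:')

# Rough N-Triples shape: subject, predicate, object, terminating dot.
# This is an assumption-level parser, not a full N-Triples grammar.
TRIPLE = re.compile(r'^\s*(\S+)\s+(\S+)\s+(.+?)\s*\.\s*$')


def is_absolute_uri(term):
    """Return True if a <...> term carries a scheme, e.g. <http://...>."""
    return bool(SCHEME.match(term[1:-1]))


def has_only_absolute_uris(line):
    """Return True if every <...> term in the triple looks absolute."""
    match = TRIPLE.match(line)
    if match is None:
        return False  # not recognisable as a triple at all
    for term in match.groups():
        # Literals (including typed ones like "42"^^<...>) start with a
        # quote and blank nodes with "_:", so both are skipped here.
        if term.startswith('<') and term.endswith('>'):
            if not is_absolute_uri(term):
                return False
    return True


def filter_ntriples(lines):
    """Split lines into (kept, dropped); dropped keeps 1-based line numbers."""
    kept, dropped = [], []
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        # Blank lines and comments are passed through unchanged.
        if not stripped or stripped.startswith('#') or has_only_absolute_uris(stripped):
            kept.append(line)
        else:
            dropped.append((number, line))
    return kept, dropped
```

Calling `filter_ntriples(open("mappingbased_properties_en.nt", encoding="utf-8"))` would return the lines that are safe to import along with the rejected lines and their line numbers. The rejected list could then serve as the basis for the kind of report or diff proposed above. For a file of this size, writing kept lines straight to an output file rather than collecting them in a list would likely be preferable.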
