Hi Tommy,

On 12/07/2011 12:13 AM, Tommy Chheng wrote:
> This solution is also flawed. Check with Batman_Kane.jpg
>
> I recommend using org.apache.commons.codec.digest.DigestUtils#md5Hex
> Relying on a commonly used library is a lot less bug prone.
>
>
> On Mon, Dec 5, 2011 at 4:17 PM, Tommy Chheng<[email protected]> wrote:
>> Thanks to the folks on the wikipedia api mailing list, the problem was
>> that the leading zero was being eaten.

I had already spotted the problem you mention and fixed it in our live
instance, which is available at http://live.dbpedia.org/sparql.
DBpedia-Live applies the fix to each article as it encounters it during
its run, so you will not yet find a foaf:depiction predicate for every
article, but over time more and more articles will have one. We will
also include the fix in the next release of DBpedia. Please have a look
at it and send me any feedback you have.

>> This will fix it in ImageExtractor#getImageUrl:
>>
>> val result = (new BigInteger(1, messageDigest)).toString(16)
>> val md5 = if (result.length % 2 != 0) "0" + result else result
>>
>> I would submit a patch but i'm unsure how to do so.
>>
>>
>> On Sat, Dec 3, 2011 at 6:38 PM, Tommy Chheng<[email protected]> wrote:
>>> I'm using ImageExtractor#getImageUrl in the extraction_framework to
>>> get the url of an image.
>>>
>>> val md = MessageDigest.getInstance("MD5")
>>> val messageDigest = md.digest(fileName.getBytes)
>>> val md5 = (new BigInteger(1, messageDigest)).toString(16)
>>>
>>> val hash1 = md5.substring(0, 1)
>>> val hash2 = md5.substring(0, 2);
>>>
>>> val urlPart = hash1 + "/" + hash2 + "/" + fileName
>>>
>>> Most of the time, the function works correctly but on a few cases, it
>>> is incorrect:
>>>
>>> For "Stewie_Griffin.png", I get 2/26/Stewie_Griffin.png but the real
>>> one is 0/02/Stewie_Griffin.png
>>>
>>> The source file info is here:
>>> http://en.wikipedia.org/wiki/File:Stewie_Griffin.png
>>> http://upload.wikimedia.org/wikipedia/en/0/02/Stewie_Griffin.png
>>>
>>> Any ideas why the hashing scheme doesn't work sometimes?
>>>
>>> --
>>> @tommychheng
>>> http://tommy.chheng.com
>>
>>
>> --
>> @tommychheng
>> http://tommy.chheng.com
>
>

--
Kind Regards
Mohamed Morsey
Department of Computer Science
University of Leipzig

_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
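
For anyone reading this thread in the archive: BigInteger#toString(16) drops
every leading zero digit, so a check on odd length restores only one of them.
A digest whose hex form starts with two zeros comes out 30 characters long,
which is even, and so is never padded. The sketch below is one possible way
to avoid this by formatting each digest byte as exactly two hex digits. It is
not the code that went into DBpedia-Live or ImageExtractor; the object and
method names (ImagePathSketch, md5Hex, imagePath) are made up for
illustration, and encoding the file name as UTF-8 is an assumption (the
original snippet used the platform default charset).

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

// Hypothetical helper, not part of the extraction framework.
object ImagePathSketch {

  /** Full 32-character lowercase hex MD5, leading zeros included. */
  def md5Hex(s: String): String = {
    // Assumption: file names are hashed as UTF-8 bytes.
    val digest = MessageDigest.getInstance("MD5")
      .digest(s.getBytes(StandardCharsets.UTF_8))
    // Two hex digits per byte, so no leading zero can be lost.
    digest.map(b => "%02x".format(b & 0xff)).mkString
  }

  /** Hashed path "<h1>/<h1h2>/<fileName>", as built in the thread. */
  def imagePath(fileName: String): String = {
    val md5 = md5Hex(fileName)
    md5.substring(0, 1) + "/" + md5.substring(0, 2) + "/" + fileName
  }
}
```

With this, ImagePathSketch.imagePath("Stewie_Griffin.png") should give the
0/02/Stewie_Griffin.png path reported above. DigestUtils#md5Hex from Apache
Commons Codec, as Tommy recommends, is the simpler choice when that
dependency is already available.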
