Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]
On Tue, Nov 3, 2009 at 03:06, David Reyes Samblas Martinez da...@tuxbrain.com wrote:
> And now some numbers: the average image is 10 kB, so on the hypothesis that there is one image per article (yes, I know there are articles with more than one image, but there are a lot of articles without images) there will be about 3,000,000 images, so 30 GB of images :P Are there any 32 GB uSD cards out there?

Can we run zlib and, wait-wait... libpng on the device? :)

--
Alex
___
Openmoko community mailing list
community@lists.openmoko.org
http://lists.openmoko.org/mailman/listinfo/community
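The estimate quoted above is simple enough to check in a few lines. This is only a sketch of the arithmetic: it reads "10kb" as 10,000 bytes and uses decimal gigabytes, both of which are assumptions, and the `ratio` parameter is a hypothetical knob for trying out a compression factor.

```python
# Back-of-the-envelope storage estimate from the figures in the message.
IMAGES = 3_000_000        # one image per article (the poster's hypothesis)
AVG_IMAGE_BYTES = 10_000  # "10kb" read as 10,000 bytes (an assumption)


def total_gigabytes(images, avg_bytes, ratio=1.0):
    """Total size in decimal gigabytes; ratio < 1.0 models compression."""
    return images * avg_bytes * ratio / 1e9


print(total_gigabytes(IMAGES, AVG_IMAGE_BYTES))       # 30.0
print(total_gigabytes(IMAGES, AVG_IMAGE_BYTES, 0.5))  # 15.0
```

With these inputs the uncompressed total just fits a 32 GB card, before counting the article text itself.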
Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]
hi,

the wish is clear :), but the images on Wikipedia are slightly problematic, see
http://en.wikipedia.org/wiki/Wikipedia:Database_download#Images_and_uploaded_files
and
http://en.wikipedia.org/wiki/Wikipedia:Copyrights#Non-free_materials_and_special_requirements

On the other hand, see :)
http://meta.wikimedia.org/wiki/Wikix

-Jörn

David Reyes Samblas Martinez wrote:
> 2009/10/31 Sean Moss-Pultz s...@openmoko.com:
> On Fri, Oct 30, 2009 at 11:29 PM, Laszlo KREKACS laszlo.krekacs.l...@gmail.com wrote:
> On Fri, Oct 30, 2009 at 4:22 PM, David Reyes Samblas Martinez da...@tuxbrain.com wrote:
>
> Are you uploading these changes to git? Can I take a look?
>
> Btw, is there any plan to implement image rendering?
>
> Math (images) are on our roadmap. Hopefully before the end of this year. The screen is only 1bit. So anything else would look kinda funny.
>
> -Sean
>
> Well, since it is clear to me that internationalization and running other apps are totally possible, and in fact can be done without much hacking, I have spent some time investigating the possibility of including images other than maths on the device, and I think it is at least closer than it may seem.
>
> I have found a process [1] that I think can be industrialized to transform any Wikipedia image into one that looks more or less good on the device. It is clear that we can expect a real-time 3D zoomable render on the WR, but I think the results are quite promising.
>
> Just some questions: is it hard to do an image viewer able to scroll vertically as we do with text? Any good tutorial on scripting with GIMP?
>
> And now some numbers: the average image is 10 kB, so on the hypothesis that there is one image per article (yes, I know there are articles with more than one image, but there are a lot of articles without images) there will be about 3,000,000 images, so 30 GB of images :P Are there any 32 GB uSD cards out there?
>
> I have to do a more in-depth analysis of how many (meaningful) images there are, using the data in the Wikipedia dumps, so we will see.
>
> [1] http://www.tuxbrain.com/en/content/images-wikireader-posible
Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]
David Samblas Martinez wrote:
> [snip]
> I have found a process [1] that I think can be industrialized to transform any Wikipedia image into one that looks more or less good on the device.
> [snip]
> And now some numbers: the average image is 10 kB, so on the hypothesis that there is one image per article there will be about 3,000,000 images, so 30 GB of images :P
> [snip]
> [1] http://www.tuxbrain.com/en/content/images-wikireader-posible

This is an awesome idea, David, but you first have to consider two things:

1- Not all English Wikipedia images are under a CC license or similar. There are a lot of copyrighted images: logos, photographs, screenshots from videogames... There is a warning about this on this Wikipedia page: http://en.wikipedia.org/wiki/Wikipedia_database#Images_and_uploaded_files

2- It is possible to download all the Wikipedia images automatically using a program called wikix (http://meta.wikimedia.org/wiki/Wikix), but someone tried it back in 2007 and the result had a size of 407 GB!! (http://yousefourabi.com/blog/2007/10/download-all-wikipedia-images-with-wikix/). So the task of downloading all the images and converting them would need a very good machine or a cluster.

PS: I know that copyrighted images are not allowed on the Spanish Wikipedia, so there we don't have the problem in point 1 :P
Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]
2009/11/3 David Reyes Samblas Martinez da...@tuxbrain.com
> And now some numbers: the average image is 10 kB, so on the hypothesis that there is one image per article (yes, I know there are articles with more than one image, but there are a lot of articles without images) there will be about 3,000,000 images, so 30 GB of images :P Are there any 32 GB uSD cards out there?

Your pbm files are not compressed. I tried compressing one with gzip and its size went down by 50%. If you use a smart image format you can probably go down much further.
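The gzip experiment described above can be reproduced roughly as follows. This is a minimal sketch, not the poster's procedure: `make_pbm` is a hypothetical helper that builds a synthetic raw (P4) PBM with a black rectangle on a white background, so the ratio it prints says nothing about real Wikipedia images, which will compress less well.

```python
import gzip


def make_pbm(width, height):
    """Build a raw (P4) PBM: a black rectangle on white (bit 1 = black)."""
    row_bytes = (width + 7) // 8          # rows are padded to whole bytes
    rows = bytearray()
    for y in range(height):
        row = bytearray(row_bytes)
        if height // 4 <= y < 3 * height // 4:
            for x in range(width // 4, 3 * width // 4):
                row[x // 8] |= 0x80 >> (x % 8)   # most significant bit first
        rows += row
    header = f"P4\n{width} {height}\n".encode("ascii")
    return header + bytes(rows)


def gzip_ratio(data):
    """Compressed size divided by original size (smaller is better)."""
    return len(gzip.compress(data)) / len(data)


pbm = make_pbm(240, 160)
print(len(pbm), "bytes uncompressed, gzip ratio", round(gzip_ratio(pbm), 3))
```

Running the same `gzip_ratio` over a directory of real converted images would give a more honest figure than one sample.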
Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]
On Tuesday 03 November 2009 09:46:44 Alexander Shulgin wrote:
> Can we run zlib and, wait-wait... libpng on the device? :)

Good point! PNG compresses 1-bit images a lot!
Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]
Davide wrote:
> On Tuesday 03 November 2009 09:46:44 Alexander Shulgin wrote:
>> Can we run zlib and, wait-wait... libpng on the device? :)
>
> Good point! PNG compresses 1-bit images a lot!

You're right! Here is one sample, done with the threshold tool in GIMP and saved in PNG format: http://tinypic.com/r/mjs58m/4
Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]
Regarding compression: I believe lzma is already built into the WikiReader application, and it compresses the images by about 50%, which is enough to start with, I guess. But I have to admit that the image in PNG looks really good, so maybe it is worth implementing on the device if it is not too resource hungry.

Regarding licensing: unless OM and/or Wikipedia says otherwise (for example by considering the WikiReader an extension of Wikipedia and allowing all Wikipedia images on it), we must stay on the safe side, so only explicitly free-licensed images will be safe to use. I'm working on the http://download.wikimedia.org/enwiki/latest/enwiki-latest-image.sql.gz table to find out how many pictures we are talking about. The image viewer must also offer some way to honor the attribution and license-text clauses, but I guess that can be done with links to text, like any other wiki page.

Regarding the machine needed to do this: since we need a width of at most 240 pixels, we can tweak Wikix to use the thumb URL, like this
http://upload.wikimedia.org/wikipedia/commons/thumb/f/f9/HN_Pegasi_B.jpg/240px-HN_Pegasi_B.jpg
instead of the full URL
http://upload.wikimedia.org/wikipedia/commons/f/f9/HN_Pegasi_B.jpg
and this will save us a lot of disk space and a step in the process :P Wikix must also be tweaked to download only free-licensed images, using the info in the enwiki-latest-image.sql.gz file; then we will surely save a lot more disk space.

David Reyes Samblas Martinez
http://www.tuxbrain.com
Open ultraportable embedded solutions
Openmoko, Openpandora, Arduino
Hey, watch out!!! There's a linux in your pocket!!!

2009/11/3 David Garabana Barro da...@garabana.com:
> On Tuesday 03 November 2009 09:46:44 Alexander Shulgin wrote:
>> Can we run zlib and, wait-wait... libpng on the device? :)
>
> Good point! PNG compresses 1-bit images a lot!
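The URL rewrite described in this message can be sketched as a small function. The pattern is inferred only from the single pair of URLs given in the message; `thumb_url` is a hypothetical helper name, and other file types or other wikis may follow a different thumbnail naming scheme, so treat this as an assumption to verify rather than a description of the upload server.

```python
def thumb_url(full_url, width=240):
    """Rewrite a Commons upload URL into its <width>px thumbnail URL.

    Assumes the layout shown in the message:
    .../wikipedia/commons/<h>/<hh>/<name> ->
    .../wikipedia/commons/thumb/<h>/<hh>/<name>/<width>px-<name>
    """
    marker = "/wikipedia/commons/"
    head, sep, tail = full_url.partition(marker)
    if not sep:
        raise ValueError("not a Commons upload URL: " + full_url)
    filename = tail.rsplit("/", 1)[-1]
    return f"{head}{marker}thumb/{tail}/{width}px-{filename}"


print(thumb_url("http://upload.wikimedia.org/wikipedia/commons/f/f9/HN_Pegasi_B.jpg"))
```

Fetching the 240-pixel thumbnail directly skips the local resize step and avoids downloading full-size originals.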
Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]
On Tue, Nov 03, 2009 at 12:15:11PM +0100, David Reyes Samblas Martinez wrote:
> Regarding licensing: unless OM and/or Wikipedia says otherwise (for example by considering the WikiReader an extension of Wikipedia and allowing all Wikipedia images on it), we must stay on the safe side, so only explicitly free-licensed images will be safe to use. I'm working on the http://download.wikimedia.org/enwiki/latest/enwiki-latest-image.sql.gz table to find out how many pictures we are talking about. The image viewer must also offer some way to honor the attribution and license-text clauses, but I guess that can be done with links to text, like any other wiki page.

The problem isn't so much about WikiMedia or OpenMoko, but that the original authors did not free the images. As such, whilst they may be allowed on Wikipedia, which is a non-profit environment, distributing them on the WikiReader (which is for-profit) may be legally problematic.

If there's a way to determine automatically whether an image is safe to copy (for instance, because it is licensed under a good CC license like by or by-sa), then it's doable. If not... it requires a lot of human filtering...

Rui
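The automatic filter suggested above could look something like this. It is a sketch under loud assumptions: the tag strings, the `(filename, license_tag)` record shape, and both function names are hypothetical, and how a license tag would actually be obtained for each image is left open, since the thread does not settle it.

```python
# Hypothetical tag names for the licenses mentioned in the message.
FREE_LICENSES = {"cc-by", "cc-by-sa"}


def is_safe_to_copy(license_tag):
    """True only for an explicitly free license; unknown means not safe."""
    return license_tag is not None and license_tag.strip().lower() in FREE_LICENSES


def split_by_license(images):
    """Split (filename, license_tag) pairs into safe and needs-review lists."""
    safe, needs_review = [], []
    for name, tag in images:
        (safe if is_safe_to_copy(tag) else needs_review).append(name)
    return safe, needs_review


sample = [("map.svg", "CC-BY-SA"), ("logo.png", "fair-use"), ("scan.jpg", None)]
print(split_by_license(sample))
```

Anything without a recognized tag lands in the review list, which matches the "stay on the safe side" position taken earlier in the thread.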
Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]
On Tuesday 03 November 2009 12:15:11 David Reyes Samblas Martinez wrote:
> Regarding compression: I believe lzma is already built into the WikiReader application, and it compresses the images by about 50%, which is enough to start with, I guess. But I have to admit that the image in PNG looks really good, so maybe it is worth implementing on the device if it is not too resource hungry.

Both PNG and PBM here are 1-bit images without lossy compression. You can obtain exactly the same final image quality in both formats, but the PNG will be smaller on disk.

The final result depends only on the method used for the RGB-to-1-bit indexed conversion.

AFAIK PNG decompression is not resource hungry. Compression *IS*.

PS: For minimal PNG file size, you *MUST* convert the image to a 1-bit indexed palette before saving it. If you use greyscale, RGB or a palette of more than 1 bit, PNG will waste space saving palette or RGB/greyscale info.
Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]
Rui Miguel Silva Seabra wrote:
> On Tue, Nov 03, 2009 at 12:15:11PM +0100, David Reyes Samblas Martinez wrote:
>> The image viewer must also offer some way to honor the attribution and license-text clauses, but I guess that can be done with links to text, like any other wiki page.
>
> The problem isn't so much about WikiMedia or OpenMoko, but that the original authors did not free the images. As such, whilst they may be allowed on Wikipedia, which is a non-profit environment, distributing them on the WikiReader (which is for-profit) may be legally problematic.

I'm not sure if this is a desired workflow, but I don't think this will be a problem if everybody builds their own WikiReader offline database. Meaning: WikiReader ships and maintains a database with all the safe content, and if you want more, you do it yourself.

PS: I think it would be a good idea to use only pictures with low dynamic range in the first place. There is no use in having a van Gogh on a 1-bit low-res screen. But having maps, flags, schematics and other low-dynamic stuff makes total sense. I am thinking especially of the huge amount of SVG content. I imagine that this can be detected fairly easily (maybe simply by compression factor).

PPS: Apropos SVG: I guess we can keep them in some kind of vector format to save space.

PPPS: We need a mailing list.

Tilman
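The "detect by compression factor" idea could be tried along these lines. This is only a sketch: the threshold of 4.0 is an arbitrary assumption, both function names are hypothetical, and the inputs are synthetic greyscale byte strings rather than real Wikipedia images.

```python
import random
import zlib


def compression_factor(pixels):
    """Uncompressed size divided by deflate-compressed size."""
    return len(pixels) / len(zlib.compress(pixels, 9))


def looks_low_dynamic(pixels, min_factor=4.0):
    """Guess that an image is diagram-like if it compresses very well."""
    return compression_factor(pixels) >= min_factor


# Two-tone "flag": half black, half white greyscale bytes.
flat = bytes([0] * 5000 + [255] * 5000)
# Stand-in for a photo: seeded pseudo-random noise.
rng = random.Random(0)
noise = bytes(rng.randrange(256) for _ in range(10000))

print(looks_low_dynamic(flat), looks_low_dynamic(noise))
```

Real photographs sit somewhere between these two extremes, so the threshold would need tuning against a sample of actual images.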
Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]
2009/11/3 David Garabana Barro da...@garabana.com:
> On Tuesday 03 November 2009 12:15:11 David Reyes Samblas Martinez wrote:
>> Regarding compression: I believe lzma is already built into the WikiReader application, and it compresses the images by about 50%, which is enough to start with, I guess. But I have to admit that the image in PNG looks really good, so maybe it is worth implementing on the device if it is not too resource hungry.
>
> Both PNG and PBM here are 1-bit images without lossy compression. You can obtain exactly the same final image quality in both formats, but the PNG will be smaller on disk.

As I said, lzma-compressed pbm files are about the same size as a png file, so if the same results can be achieved, I vote for staying with what is already implemented. And it seems that this way the image is compressed a little bit more: I see the sample png file is 4263 bytes, and the same pbm+lzma is about 2937 bytes. (sucotronic, can you please email me the name of the threshold tool in Spanish, and post the values you chose, if any?)

> The final result depends only on the method used for the RGB-to-1-bit indexed conversion.
>
> AFAIK PNG decompression is not resource hungry. Compression *IS*.

Well, compression is done on the host, so no problem on that side. In fact the WR is decompressing a huge amount of text in lzma quite fast, so a tiny file of 2-4 kB will be no problem.

> PS: For minimal PNG file size, you *MUST* convert the image to a 1-bit indexed palette before saving it. If you use greyscale, RGB or a palette of more than 1 bit, PNG will waste space saving palette or RGB/greyscale info.

Totally agree :)

David Reyes Samblas Martinez
http://www.tuxbrain.com
Open ultraportable embedded solutions
Openmoko, Openpandora, Arduino
Hey, watch out!!! There's a linux in your pocket!!!
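The comparison made in this message (deflate as used by PNG versus lzma on the raw bitmap) can be repeated on any bitmap with the standard library. A sketch under assumptions: `striped_bitmap` is a hypothetical synthetic test image, Python's `lzma.compress` writes an .xz container by default, and nothing here says which lzma container the WikiReader firmware actually reads.

```python
import lzma
import zlib


def striped_bitmap(width=240, height=160):
    """Packed 1-bit rows of alternating 8-pixel black and white stripes."""
    row = bytes([0xFF, 0x00] * (width // 16))
    return row * height


def compressed_sizes(data):
    """Byte counts for the raw data and for two lossless compressors."""
    return {
        "raw": len(data),
        "zlib": len(zlib.compress(data, 9)),  # deflate, as used inside PNG
        "lzma": len(lzma.compress(data)),     # .xz container by default
    }


bitmap = striped_bitmap()
print(compressed_sizes(bitmap))

# Compression happens on the host; the device only needs the inverse step.
assert lzma.decompress(lzma.compress(bitmap)) == bitmap
```

On files of only a few kilobytes the container overhead matters, so the ranking seen on one sample may not hold for every image.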
Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]
> I'm not sure if this is a desired workflow, but I don't think this will be a problem if everybody builds their own WikiReader offline database. Meaning: WikiReader ships and maintains a database with all the safe content, and if you want more, you do it yourself.

It seems a great idea to me. If the viewer is already implemented, you can parse/render the whole Wikipedia yourself and include the image links that are stripped from the official version, with a --include-non-free option in make or an unofficial patch that disables this filtering. Then it is up to the (advanced) user to include these images or not, and he gains nothing from it beyond enjoying the images. I think it is a good approach to the licensing issue.

> PS: I think it would be a good idea to use only pictures with low dynamic range in the first place. There is no use in having a van Gogh on a 1-bit low-res screen.

I don't agree with this. It is clear that you cannot appreciate the subtle mastery of color or the smart use of light in a 1-bit, 240px-wide image :P but you can see what it looks like, and on the WR that is more than enough for me.

> But having maps, flags, schematics and other low-dynamic stuff makes total sense.

I see the flags as more problematic than van Gogh... A lot of them rely on colors to differentiate them from each other, so the Italian, French and Irish flags, and the whole myriad of flags with three vertical colors, will be very hard to tell apart.

> I am thinking especially of the huge amount of SVG content. I imagine that this can be detected fairly easily (maybe simply by compression factor).

Or by its extension :P

> PPS: Apropos SVG: I guess we can keep them in some kind of vector format to save space.

With the sizes we are talking about (3-4 kB once compressed), an SVG will rarely be smaller than this, and I think rendering a vector image is more resource hungry than a plain bitmap. But if the device can handle it, it would be awesome as a map viewer :)

> PPPS: We need a mailing list.

As long as people tag the topic, I feel comfortable on the OM list for an OM device.

> Tilman

David Reyes Samblas Martinez
http://www.tuxbrain.com
Open ultraportable embedded solutions
Openmoko, Openpandora, Arduino
Hey, watch out!!! There's a linux in your pocket!!!
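The point about three-stripe flags can be illustrated with a plain luminance threshold. This is a sketch with assumptions stated up front: the RGB triples are approximate flag colors chosen for illustration, the weights are the common ITU-R BT.601 luma coefficients, and a cutoff at 128 is only one possible conversion, not the process used for the sample images in this thread.

```python
def luma(rgb):
    """Approximate perceived brightness (ITU-R BT.601 weights), 0..255."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def to_1bit(stripes, threshold=128):
    """Map each stripe color to 'W' (white) or 'B' (black)."""
    return "".join("W" if luma(color) >= threshold else "B" for color in stripes)


# Approximate stripe colors, left to right (assumed values).
ITALY = [(0, 146, 70), (255, 255, 255), (206, 43, 55)]
FRANCE = [(0, 85, 164), (255, 255, 255), (239, 65, 53)]

print(to_1bit(ITALY), to_1bit(FRANCE))  # BWB BWB
```

With these values both flags collapse to the same black, white, black pattern. A dithered conversion would render each stripe as a different dot density instead, which may keep some of the distinction.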
Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]
David Reyes Samblas Martinez wrote:
>> But having maps, flags, schematics and other low-dynamic stuff makes total sense.
>
> I see the flags as more problematic than van Gogh... A lot of them rely on colors to differentiate them from each other, so the Italian, French and Irish flags, and the whole myriad of flags with three vertical colors, will be very hard to tell apart.

You see that effect in cheap newspaper prints. They have fairly large 1-bit pixels, and it works well enough. It's ugly for pictures, but works very well for diagrams or anything like that. You won't have absolute colours, but it still works well.

>> PPS: Apropos SVG: I guess we can keep them in some kind of vector format to save space.
>
> With the sizes we are talking about (3-4 kB once compressed), an SVG will rarely be smaller than this, and I think rendering a vector image is more resource hungry than a plain bitmap. But if the device can handle it, it would be awesome as a map viewer :)

True. 1-bit images are probably smaller than vector graphics. What we might miss is 'scrolling' (paging).
Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]
On Tue, Nov 3, 2009 at 4:06 AM, David Reyes Samblas Martinez da...@tuxbrain.com wrote:
> [snip]
> I have found a process [1] that I think can be industrialized to transform any Wikipedia image into one that looks more or less good on the device. It is clear that we can expect a real-time 3D zoomable render on the WR, but I think the results are quite promising.

One option for industrializing the image conversion may be something like this one-liner using ImageMagick:

    convert infile.png -geometry 240 +dither -colors 2 -colorspace gray -contrast-stretch 0 -normalize outfile.pbm

For reference see http://www.imagemagick.org/Usage/quantize/

> [snip]

--
So long, and thanks for all the fish.
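For readers without ImageMagick, the last step of such a pipeline (grey values to a raw PBM) can be sketched in plain Python. This is not equivalent to the one-liner above: there is no resizing, no contrast stretch and no colour quantization, just a fixed threshold, and `gray_to_pbm` is a hypothetical helper that expects the image already decoded into rows of 0..255 grey values.

```python
def gray_to_pbm(rows, threshold=128):
    """Threshold rows of 0..255 grey values into a raw (P4) PBM image."""
    height, width = len(rows), len(rows[0])
    out = bytearray(f"P4\n{width} {height}\n".encode("ascii"))
    for row in rows:
        packed = bytearray((width + 7) // 8)   # each row padded to whole bytes
        for x, value in enumerate(row):
            if value < threshold:              # dark pixel -> bit 1 (black)
                packed[x // 8] |= 0x80 >> (x % 8)
        out += packed
    return bytes(out)


# One row of ten pixels alternating black and white.
print(gray_to_pbm([[0, 255] * 5]))
```

A fixed threshold is the crudest choice; the quantize page linked above covers dithering and other options that usually look better on photographs.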
Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]
2009/11/3 Evgeniy Ginzburg nad@gmail.com
> On Tue, Nov 3, 2009 at 4:06 AM, David Reyes Samblas Martinez da...@tuxbrain.com wrote:
>> [snip]
>> I have found a process [1] that I think can be industrialized to transform any Wikipedia image into one that looks more or less good on the device. It is clear that we can expect a real-time 3D zoomable render on the WR, but I think the results are quite promising.
>
> One option for industrializing the image conversion may be something like this one-liner using ImageMagick:
>
>     convert infile.png -geometry 240 +dither -colors 2 -colorspace gray -contrast-stretch 0 -normalize outfile.pbm
>
> For reference see http://www.imagemagick.org/Usage/quantize/

ASCII art could be nice too. And it wouldn't require much hacking on the device side :-)
Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]
On Tue, Nov 3, 2009 at 6:55 PM, Michal Brzozowski ruso...@poczta.fm wrote:
> 2009/11/3 Evgeniy Ginzburg nad@gmail.com
>> [snip]
>> One option for industrializing the image conversion may be something like this one-liner using ImageMagick:
>>
>>     convert infile.png -geometry 240 +dither -colors 2 -colorspace gray -contrast-stretch 0 -normalize outfile.pbm
>>
>> For reference see http://www.imagemagick.org/Usage/quantize/
>
> ASCII art could be nice too. And it wouldn't require much hacking on the device side :-)

I've just tried viewing 240-pixel-wide images as ASCII: you cannot see anything. Using .PBM lets you see (in the worst case) something.

--
So long, and thanks for all the fish.
Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]
2009/11/3 Evgeniy Ginzburg nad@gmail.com:
> [snip]
> One option for industrializing the image conversion may be something like this one-liner using ImageMagick:
>
>     convert infile.png -geometry 240 +dither -colors 2 -colorspace gray -contrast-stretch 0 -normalize outfile.pbm
>
> For reference see http://www.imagemagick.org/Usage/quantize/
> [snip]

GIMP also allows batch processing and scripting. I still have to find out how, but I know it can. We will choose whichever option gives the better results.

David Reyes Samblas Martinez
http://www.tuxbrain.com
Open ultraportable embedded solutions
Openmoko, Openpandora, Arduino
Hey, watch out!!! There's a linux in your pocket!!!
Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]
2009/11/3 David Reyes Samblas Martinez da...@tuxbrain.com:
> [snip]
> I have found a process [1] that I think can be industrialized to transform any Wikipedia image into one that looks more or less good on the device. It is clear that we can expect a real-time 3D zoomable render on the WR, but I think the results are quite promising.
> [snip]
> [1] http://www.tuxbrain.com/en/content/images-wikireader-posible

[...] It's clear we can NOT expect 3D render [...]