Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]

2009-11-03 Thread Alexander Shulgin
On Tue, Nov 3, 2009 at 03:06, David Reyes Samblas Martinez
da...@tuxbrain.com wrote:

 And now some numbers: the average image is 10 KB, so on the hypothesis
 that there is one image per article (yes, I know there are articles with
 more than one image, but there are a lot of articles without images) there
 will be about 3,000,000 images, so 30 GB of images :P Are there any
 32 GB uSD cards out there?

Can we run zlib and, wait-wait... libpng on the device? :)

--
Alex

___
Openmoko community mailing list
community@lists.openmoko.org
http://lists.openmoko.org/mailman/listinfo/community


Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]

2009-11-03 Thread Jörn Hagen

hi,

the wish is clear :), but the images on Wikipedia are slightly
problematic, see

  http://en.wikipedia.org/wiki/Wikipedia:Database_download#Images_and_uploaded_files

and

  http://en.wikipedia.org/wiki/Wikipedia:Copyrights#Non-free_materials_and_special_requirements

On the other hand, see :)

  http://meta.wikimedia.org/wiki/Wikix

-Jörn







Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]

2009-11-03 Thread sucotronic




This is an awesome idea, David, but first you have to consider two things:

1- Not all English Wikipedia images are under a CC license or similar.
There are a lot of copyrighted images: logos, photographs, screenshots
from video games...
There is a warning about this on this Wikipedia page:
http://en.wikipedia.org/wiki/Wikipedia_database#Images_and_uploaded_files

2- It's possible to automatically download all the Wikipedia images using a
program called Wikix (http://meta.wikimedia.org/wiki/Wikix), but someone
tried it back in 2007 and the result had a size of 407 GB!!
(http://yousefourabi.com/blog/2007/10/download-all-wikipedia-images-with-wikix/).
So the task of downloading all the images and converting them would need
a very good machine or a cluster.

PS: I know that on the Spanish Wikipedia copyrighted images are not allowed and
we don't have the point 2 problem :P

-- 
View this message in context: 
http://n2.nabble.com/wikireader-Images-on-the-WR-not-so-imposible-P-was-wikireader-Error-on-parsing-the-spanish-wikipedia-tp3935879p3937360.html
Sent from the Openmoko Community mailing list archive at Nabble.com.



Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]

2009-11-03 Thread Michal Brzozowski
2009/11/3 David Reyes Samblas Martinez da...@tuxbrain.com

 And now some numbers: the average image is 10 KB, so on the hypothesis
 that there is one image per article (yes, I know there are articles with
 more than one image, but there are a lot of articles without images) there
 will be about 3,000,000 images, so 30 GB of images :P Are there any
 32 GB uSD cards out there?


Your pbm files are not compressed. I've tried compressing one with gzip and
it went down by 50%. If you use some smart image format you can probably go
down much more.
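
The effect of compressing an uncompressed PBM can be tried with a few lines of Python. This is only a sketch under assumptions: the image is a synthetic disc (not one of the samples from the thread), the 240x208 size is an arbitrary choice, and `make_sample_pbm` is a hypothetical helper. A synthetic shape compresses far better than a real thresholded photo, so the numbers it prints say nothing about the 50% figure above.

```python
import lzma
import zlib


def make_sample_pbm(width: int = 240, height: int = 208) -> bytes:
    """Build a raw (P4) PBM holding a filled disc; purely synthetic test data."""
    row_bytes = (width + 7) // 8          # P4 packs 8 pixels per byte, rows are padded
    cx, cy = width // 2, height // 2
    radius_sq = (min(width, height) // 3) ** 2
    body = bytearray()
    for y in range(height):
        row = bytearray(row_bytes)
        for x in range(width):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius_sq:
                row[x // 8] |= 0x80 >> (x % 8)   # leftmost pixel is the high bit
        body += row
    header = f"P4\n{width} {height}\n".encode("ascii")
    return header + bytes(body)


pbm = make_sample_pbm()
deflated = zlib.compress(pbm, 9)   # DEFLATE, the algorithm gzip uses
xz = lzma.compress(pbm)            # Python's default .xz container

for name, blob in (("raw PBM", pbm), ("zlib", deflated), ("lzma", xz)):
    print(f"{name:8s} {len(blob):6d} bytes")
```

Both compressors are lossless, so the decompressed bytes are identical to the original PBM; only the size on the card changes.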


Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]

2009-11-03 Thread David Garabana Barro
On Tuesday 03 November 2009 09:46:44 Alexander Shulgin wrote:

 Can we run zlib and, wait-wait... libpng on the device? :)

Good point!
PNG compresses 1-bit images a lot!



Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]

2009-11-03 Thread sucotronic



Davide wrote:
 
 On Tuesday 03 November 2009 09:46:44 Alexander Shulgin wrote:
 
 Can we run zlib and, wait-wait... libpng on the device? :)
 
 Good point!
 PNG compresses 1-bit images a lot!
 
 
 

You're right!
Here is one sample done with the threshold tool in GIMP and saved in PNG format:
http://tinypic.com/r/mjs58m/4
-- 
View this message in context: 
http://n2.nabble.com/wikireader-Images-on-the-WR-not-so-imposible-P-was-wikireader-Error-on-parsing-the-spanish-wikipedia-tp3935879p3937629.html
Sent from the Openmoko Community mailing list archive at Nabble.com.



Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]

2009-11-03 Thread David Reyes Samblas Martinez
Regarding compression, I believe lzma is already built into the
WikiReader application, and it compresses the images by about 50%, enough
for a start I guess. But I have to admit that the image in PNG looks
really good, so maybe it is worth implementing it on the
device if it's not too resource hungry.

Regarding licensing: well, unless OM and/or Wikipedia say
otherwise (for example by considering the WikiReader an extension of
Wikipedia and allowing all Wikipedia images to be on the WikiReader), we must
stay on the safe side, so only explicitly free-licensed images will be
safe to use. I'm working on the
http://download.wikimedia.org/enwiki/latest/enwiki-latest-image.sql.gz
table to find out how many pictures we are talking about.
Also, the image viewer must offer some way to respect the attribution and
license-text clauses so we do not infringe them, but I guess that
can be done with links to text, like one more wiki page.

Regarding the machine needed to do this: since we need a maximum width of 240
pixels, we can tweak Wikix to use the thumb URL, like this
http://upload.wikimedia.org/wikipedia/commons/thumb/f/f9/HN_Pegasi_B.jpg/240px-HN_Pegasi_B.jpg
instead of the full URL

http://upload.wikimedia.org/wikipedia/commons/f/f9/HN_Pegasi_B.jpg

and this will save us a lot of disk space and a step in the process :P
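
The rewrite from full URL to thumb URL can be sketched in Python. This is not how Wikix does it (Wikix is not written in Python). `thumb_url` is a hypothetical helper that generalizes from the single URL pair shown above, and other file types may need a different thumbnail suffix.

```python
from urllib.parse import urlsplit, urlunsplit


def thumb_url(full_url: str, width: int = 240) -> str:
    """Turn .../<a>/<ab>/<name> into .../thumb/<a>/<ab>/<name>/<width>px-<name>."""
    parts = urlsplit(full_url)
    segments = parts.path.split("/")
    if len(segments) < 4:
        raise ValueError(f"unexpected upload path: {parts.path!r}")
    # The last three segments are the two hash directories and the file name.
    prefix, hash_dirs, name = segments[:-3], segments[-3:-1], segments[-1]
    thumb_path = "/".join(prefix + ["thumb"] + hash_dirs + [name, f"{width}px-{name}"])
    return urlunsplit((parts.scheme, parts.netloc, thumb_path, parts.query, parts.fragment))


print(thumb_url("http://upload.wikimedia.org/wikipedia/commons/f/f9/HN_Pegasi_B.jpg"))
```

For the example from this message the function reproduces the 240px thumb URL quoted above.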

Wikix must also be tweaked to download only free-licensed images,
using the info in the enwiki-latest-image.sql.gz file; then we will surely
save a lot more disk space.

David Reyes Samblas Martinez
http://www.tuxbrain.com
Open ultraportable  embedded solutions
Openmoko, Openpandora,  Arduino
Hey, watch out!!! There's a linux in your pocket!!!






Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]

2009-11-03 Thread Rui Miguel Silva Seabra
On Tue, Nov 03, 2009 at 12:15:11PM +0100, David Reyes Samblas Martinez wrote:
 Regarding licensing: well, unless OM and/or Wikipedia say
 otherwise (for example by considering the WikiReader an extension of
 Wikipedia and allowing all Wikipedia images to be on the WikiReader), we must
 stay on the safe side, so only explicitly free-licensed images will be
 safe to use. I'm working on the
 http://download.wikimedia.org/enwiki/latest/enwiki-latest-image.sql.gz
 table to find out how many pictures we are talking about.
 Also, the image viewer must offer some way to respect the attribution and
 license-text clauses so we do not infringe them, but I guess that
 can be done with links to text, like one more wiki page.

The problem isn't so much about WikiMedia or OpenMoko, but that the
original authors did not free the images.

As such, whilst maybe they can be on Wikipedia, which is on a non-profit
environment, distributing on the WikiReader (which is for-profit) may
be legally problematic.

If there's a way to automatically determine if the image is safe to copy
(for instance, being licensed with a good CC license like by, by-sa) then
it's doable. If not... it requires a lot of human filtering...
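
An automatic check of this kind could look like the sketch below. Everything in it is an assumption: the thread does not say how license information is recorded in the dumps, so the tag strings, the allow-list, and the `is_free_license` helper are hypothetical, and anything the check does not recognize is treated as not safe.

```python
# Hypothetical allow-list; by and by-sa are the licenses named above.
FREE_PREFIXES = ("cc-by-sa", "cc-by")
# Clauses that make a CC license unsuitable here: non-commercial, no-derivatives.
RESTRICTIVE_MARKERS = ("-nc", "-nd")


def is_free_license(tag: str) -> bool:
    """Return True only for tags that clearly match the allow-list."""
    normalized = tag.strip().lower()
    if any(marker in normalized for marker in RESTRICTIVE_MARKERS):
        return False
    return normalized.startswith(FREE_PREFIXES)


for tag in ("CC-BY-SA-3.0", "cc-by-2.5", "cc-by-nc-2.0", "fair use"):
    print(f"{tag:14s} -> {'keep' if is_free_license(tag) else 'needs human review'}")
```

Defaulting to "needs human review" keeps the automatic path conservative: a missing or unknown tag never lets an image through.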

Rui



Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]

2009-11-03 Thread David Garabana Barro
On Tuesday 03 November 2009 12:15:11 David Reyes Samblas Martinez wrote:
 Regarding compression, I believe lzma is already built into the
 WikiReader application, and it compresses the images by about 50%, enough
 for a start I guess. But I have to admit that the image in PNG looks
 really good, so maybe it is worth implementing it on the
 device if it's not too resource hungry.

Both PNG and PBM hold 1-bit images without lossy compression.
You can obtain exactly the same final image quality with both formats, but PNG
will have a smaller size on disk.
The final result depends only on the method used to convert RGB to 1-bit indexed.

AFAIK PNG decompression is not resource hungry. Compression *IS*.

PS: For minimal PNG file size, you *MUST* convert the image to a 1-bit indexed
palette before saving it.
If you use greyscale, RGB, or a palette of more than 1 bit, PNG will waste space
saving palette or RGB/greyscale info.



Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]

2009-11-03 Thread Tilman Baumann
Rui Miguel Silva Seabra wrote:
 On Tue, Nov 03, 2009 at 12:15:11PM +0100, David Reyes Samblas Martinez
 wrote:

 Also, the image viewer must offer some way to respect the attribution and
 license-text clauses so we do not infringe them, but I guess that
 can be done with links to text, like one more wiki page.

 The problem isn't so much about WikiMedia or OpenMoko, but that the
 original authors did not free the images.

 As such, whilst maybe they can be on Wikipedia, which is on a non-profit
 environment, distributing on the WikiReader (which is for-profit) may
 be legally problematic.

I'm not sure if this is a desired workflow, but I don't think this will be
a problem if everybody builds their own WikiReader offline database.

Meaning: WikiReader ships and maintains a database with all the safe content,
and if you want more, you do it yourself.

PS: I think it would be a good idea to use only pictures with low dynamic range in
the first place. There is no use in having a van Gogh on a 1-bit low-res
screen.
But having maps, flags, schematics and other low-dynamic stuff makes total
sense.
I am thinking especially of the huge amount of SVG content.
I imagine that this can be detected fairly easily (maybe simply by
compression factor).
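
Detection by compression factor could be sketched as follows. The 0.25 cut-off, the synthetic test data, and both helper names are assumptions chosen for illustration; a real threshold would have to be tuned on actual Wikipedia images.

```python
import random
import zlib


def compression_ratio(raw: bytes) -> float:
    """Compressed size divided by original size; lower means more compressible."""
    return len(zlib.compress(raw, 9)) / len(raw)


def looks_low_dynamic(raw: bytes, threshold: float = 0.25) -> bool:
    """Guess that very compressible pixel data is a diagram, map or flag."""
    return compression_ratio(raw) <= threshold


SIZE = 240 * 208                                   # one byte per pixel, arbitrary size
flat = bytes(SIZE)                                 # a single flat area
rng = random.Random(0)
noisy = bytes(rng.randrange(256) for _ in range(SIZE))   # stand-in for a photo

print("flat :", round(compression_ratio(flat), 4), looks_low_dynamic(flat))
print("noisy:", round(compression_ratio(noisy), 4), looks_low_dynamic(noisy))
```

The two inputs are the extremes; real images fall somewhere in between, which is why the cut-off needs tuning.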

PPS: Apropos SVG: I guess we can keep them in some kind of vector format
to save space.

PPPS: We need a mailing list

 Tilman




Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]

2009-11-03 Thread David Reyes Samblas Martinez
2009/11/3 David Garabana Barro da...@garabana.com:
 On Tuesday 03 November 2009 12:15:11 David Reyes Samblas Martinez wrote:
 Regarding compression, I believe lzma is already built into the
 WikiReader application, and it compresses the images by about 50%, enough
 for a start I guess. But I have to admit that the image in PNG looks
 really good, so maybe it is worth implementing it on the
 device if it's not too resource hungry.

 Both PNG and PBM hold 1-bit images without lossy compression.
 You can obtain exactly the same final image quality with both formats, but PNG
 will have a smaller size on disk.
As I said, lzma-compressed PBM files are about the same size as a PNG
file, so if the same results can be achieved, I vote for staying with what's
already implemented.
And it seems that this way it is compressed a little bit more:
I see the sample PNG file is 4263 bytes and the same PBM+lzma is about 2937 bytes.
(sucotronic, can you please email me the name of the threshold tool
in Spanish and post the values you chose, if any?)
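
The two sizes can be put side by side with a little arithmetic. This is a sketch that extrapolates from a single sample, which is an assumption: the 3,000,000 image count is the estimate from the start of the thread, and one image cannot stand in for the whole collection.

```python
PNG_BYTES = 4263          # sample PNG size quoted above
PBM_LZMA_BYTES = 2937     # same image as PBM + lzma, quoted above
IMAGE_COUNT = 3_000_000   # rough one-image-per-article estimate from the thread

saving = 1 - PBM_LZMA_BYTES / PNG_BYTES
projected_total = PBM_LZMA_BYTES * IMAGE_COUNT

print(f"PBM+lzma is {saving:.0%} smaller than the PNG for this sample")
print(f"projected total at that size: {projected_total / 10**9:.1f} GB")
```

At the sample's size the whole set would land near 9 GB rather than 30 GB, if the sample were representative.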
 Final result only depends on RGB-1 bit indexed conversion method used.


 AFAIK png decompression is not resource hungry. Compression *IS*.
Well, compression is done on the host, so no problem on that side. In fact the WR
is decompressing huge amounts of lzma-compressed text quite fast, so a tiny file
of 2-4 KB will be no problem.

 PS: For minimal PNG file size, you *MUST* convert the image to a 1-bit indexed
 palette before saving it.
 If you use greyscale, RGB, or a palette of more than 1 bit, PNG will waste space
 saving palette or RGB/greyscale info.
totally agree :)



Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]

2009-11-03 Thread David Reyes Samblas Martinez
 I'm not sure if this is a desired workflow, but I don't think this will be
 a problem if everybody builds their own WikiReader offline database.

 Meaning: WikiReader ships and maintains a database with all the safe content,
 and if you want more, you do it yourself.
If the viewer is already implemented,
you parse/render the whole Wikipedia again to include the image links
that were stripped out of the official version, with an --include-non-free
option in make or an unofficial patch to skip this filtering... It
seems a great idea to me. Then it is up to the (advanced) user to include
these images or not, and he is not making any profit beyond enjoying
the images. I think it is a good approach to the licensing issue.

 PS: I think it would be a good idea to use only pictures with low dynamic range in
 the first place. There is no use in having a van Gogh on a 1-bit low-res
 screen.
I don't agree with this. It is clear that you cannot appreciate the subtle
mastery of colors or the smart use of light in a 1-bit color depth,
240px-wide image :P but you can see what it looks like, and on the WR
that is good enough for me.
 But having maps, flags, schematics and other low-dynamic stuff makes total
 sense.
I see flags as more problematic than van Gogh... a lot of them
rely on colors to differentiate them from each other, so the Italian, French, Irish
and all the myriad flags with three vertical color bands will be very hard
to tell apart.
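
The flag problem can be illustrated with a small sketch. It rests on several assumptions: the RGB values are approximations picked for illustration rather than official flag colors, the conversion is a plain brightness threshold at 128 with no dithering, and the Rec. 601 luma weights are just one common way to turn RGB into grey.

```python
def luma(rgb):
    """Approximate perceived brightness (Rec. 601 weights), 0..255."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def to_one_bit(stripes, threshold=128):
    """Map each stripe color to 0 (black) or 1 (white)."""
    return tuple(1 if luma(color) >= threshold else 0 for color in stripes)


WHITE = (255, 255, 255)
FLAGS = {   # left-to-right stripes, approximate colors
    "Italy": ((0, 146, 70), WHITE, (206, 43, 55)),
    "France": ((0, 85, 164), WHITE, (239, 65, 53)),
    "Ireland": ((22, 155, 98), WHITE, (255, 136, 62)),
}

one_bit = {name: to_one_bit(stripes) for name, stripes in FLAGS.items()}
for name, pattern in one_bit.items():
    print(f"{name:8s} {pattern}")
```

With these values Italy and France collapse to the same black-white-black pattern, while Ireland differs only because its orange stripe is bright enough to cross the threshold. Dithering would keep more of the difference than a hard threshold does.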
 I am thinking especially of the huge amount of SVG content.
 I imagine that this can be detected fairly easily (maybe simply by
 compression factor).
or by its extension :P

 PPS: Apropos SVG: I guess we can keep them in some kind of vector format
 to save space.
With the sizes we are talking about (3-4 KB once compressed), an
SVG will rarely be smaller than this, and I think rendering a vector image is
more resource hungry than just a plain bitmap. But if the device can
handle it, it could be awesome as a map viewer :)

 PPPS: We need a mailing list
As long as people tag the topic, I feel comfortable on the OM list for an OM device.

  Tilman




Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]

2009-11-03 Thread Tilman Baumann
David Reyes Samblas Martinez wrote:

 But having maps, flags, schematics and other low-dynamic stuff makes
 total sense.
 I see flags as more problematic than van Gogh... a lot of them
 rely on colors to differentiate them from each other, so the Italian, French, Irish
 and all the myriad flags with three vertical color bands will be very hard
 to tell apart.

You see that effect in cheap newspaper prints. They have fairly large 1-bit
pixels. It works well enough.
It's ugly for pictures, but works very well for diagrams or anything like that.

You won't have absolute colours, but that still works well.

 PPS: Apropos SVG: I guess we can keep them in some kind of vector format
 to save space.
 With the sizes we are talking about (3-4 KB once compressed), an
 SVG will rarely be smaller than this, and I think rendering a vector image is
 more resource hungry than just a plain bitmap. But if the device can
 handle it, it could be awesome as a map viewer :)

True. 1-bit images are probably smaller than vector graphics. What we
would miss, maybe, is 'scrolling' (paging).







Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]

2009-11-03 Thread Evgeniy Ginzburg
On Tue, Nov 3, 2009 at 4:06 AM, David Reyes Samblas Martinez
da...@tuxbrain.com wrote:
[snip]
 I have found a process[1] that I think can be industrialized to transform
 any Wikipedia image into one that is more or less good for the device. It is
 clear that we can NOT expect a real-time 3D zoomable render on the WR, but
 I think the results are quite promising.
One option for industrializing the image conversion may be
something like this one-liner using ImageMagick:

convert infile.png -geometry 240 +dither -colors 2 -colorspace gray -contrast-stretch 0 -normalize outfile.pbm

For reference see http://www.imagemagick.org/Usage/quantize/
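
To run that one-liner over many files, a small Python wrapper could be used. This is a sketch, assuming ImageMagick's `convert` is installed and on the PATH. The options are exactly the ones from the command above, while `build_convert_command`, `convert_directory`, and the directory layout are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(infile, outfile, width=240):
    """Return the argv list for the ImageMagick one-liner quoted above."""
    return [
        "convert", str(infile),
        "-geometry", str(width),
        "+dither", "-colors", "2",
        "-colorspace", "gray",
        "-contrast-stretch", "0",
        "-normalize",
        str(outfile),
    ]


def convert_directory(src_dir, dst_dir):
    """Convert every .png in src_dir to a .pbm in dst_dir (hypothetical layout)."""
    if shutil.which("convert") is None:
        raise RuntimeError("ImageMagick 'convert' was not found on PATH")
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for infile in sorted(Path(src_dir).glob("*.png")):
        outfile = dst / (infile.stem + ".pbm")
        # check=True stops the batch on the first failed conversion
        subprocess.run(build_convert_command(infile, outfile), check=True)


print(" ".join(build_convert_command("infile.png", "outfile.pbm")))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces, which are common in Wikipedia image titles.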

[snip]
-- 
So long, and thanks for all the fish.



Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]

2009-11-03 Thread Michal Brzozowski

Ascii art could be nice too. And wouldn't require much hacking on the device
side :-)


Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]

2009-11-03 Thread Evgeniy Ginzburg
On Tue, Nov 3, 2009 at 6:55 PM, Michal Brzozowski ruso...@poczta.fm wrote:
 Ascii art could be nice too. And wouldn't require much hacking on the device
 side :-)

I've just tried to view 240-pixel-wide images as ASCII art and could not make anything out.
With PBM you can at least see something, even in the worst case.



-- 
So long, and thanks for all the fish.



Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]

2009-11-03 Thread David Reyes Samblas Martinez

GIMP also allows batch processing and scripting. I still have to
find out how, but I know it can. We will choose whichever option gives the
best results.



[wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]

2009-11-02 Thread David Reyes Samblas Martinez
2009/10/31 Sean Moss-Pultz s...@openmoko.com:
 On Fri, Oct 30, 2009 at 11:29 PM, Laszlo KREKACS
 laszlo.krekacs.l...@gmail.com wrote:
 On Fri, Oct 30, 2009 at 4:22 PM, David Reyes Samblas Martinez
 da...@tuxbrain.com wrote:
 Are you uploading these changes to git? Can I take a look?

 Btw, is there any plan to implement image rendering?

 Math (images) are on our roadmap. Hopefully before the end of this
 year. The screen is only 1bit. So anything else would look kinda
 funny.

  -Sean
Well, since it is clear to me that internationalization and running other apps
are totally possible, and in fact can be done without much hacking, I
have spent some time investigating the possibility of including images
other than maths on the device, and I think it is at least closer
than it may seem.
I have found a process[1] that I think can be industrialized to transform
any Wikipedia image into one that is more or less good for the device. It is
clear that we can NOT expect a real-time 3D zoomable render on the WR, but
I think the results are quite promising.

Just some questions: is it hard to write an image viewer able to scroll
vertically as we do with text?
Any good tutorial on scripting with GIMP?

And now some numbers: the average image is 10 KB, so on the hypothesis
that there is one image per article (yes, I know there are articles with
more than one image, but there are a lot of articles without images) there
will be about 3,000,000 images, so 30 GB of images :P Are there any
32 GB uSD cards out there?
I have to do a more in-depth analysis of how many (meaningful) images
there are, using the data in the Wikipedia dumps, so we will see.
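
The back-of-the-envelope estimate can be written out as a sketch. It assumes decimal units (1 KB = 1000 bytes, 1 GB = 10^9 bytes) and ignores file-system overhead and the space taken by the article text itself; `estimate_total_bytes` is a hypothetical helper.

```python
def estimate_total_bytes(image_count: int, average_bytes: int) -> int:
    """Total storage for image_count images of average_bytes each."""
    return image_count * average_bytes


IMAGE_COUNT = 3_000_000      # one image per article, as assumed above
AVERAGE_BYTES = 10_000       # 10 KB per image
CARD_BYTES = 32 * 10**9      # a 32 GB card, decimal gigabytes

total = estimate_total_bytes(IMAGE_COUNT, AVERAGE_BYTES)
fits = total <= CARD_BYTES

print(f"total: {total / 10**9:.0f} GB, fits on a 32 GB card: {fits}")
```

Any reduction in the average image size scales the total linearly, so halving the size per image halves the storage needed.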

[1]http://www.tuxbrain.com/en/content/images-wikireader-posible




Re: [wikireader] Images on the WR not so imposible :P [was [wikireader]Error on parsing the spanish wikipedia]

2009-11-02 Thread David Reyes Samblas Martinez

[...]It's clear we can NOT expect 3D render[...]



Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-30 Thread Sean Moss-Pultz
On Fri, Oct 30, 2009 at 4:50 AM, David Reyes Samblas Martinez
da...@tuxbrain.com wrote:
 Hi, I'm trying to generate the files for a Spanish Wikipedia on the WR.
 After successfully compiling the source from git, I had to solve some
 annoyances with UTF-8 encoding in Python. The error was something like this:
 UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in
 position: ordinal not in range(128)
 This was solved by changing the default encoding from ascii to utf8 in the
 /usr/lib/python2.6/site.py file.
 After this I was able to run this command successfully:
 make DESTDIR=image WORKDIR=work
 XML_FILES=xml-file-samples/eswiki-latest-pages-articles.xml index
 parse render combine

 Everything seemed fine for a couple of hours (about 6-7h) of parsing the
 70 articles in Spanish, but then... the horror:
 Count: 38
 Traceback (most recent call last):
   File "./ArticleParser.py", line 224, in <module>
     main()
   File "./ArticleParser.py", line 172, in main
     process_article_text(title.encode('utf-8'), f.read(length), newf)
   File "./ArticleParser.py", line 218, in process_article_text
     newf.write(text + '\n')
 IOError: [Errno 32] Broken pipe
 make[1]: *** [parse] Error 1
 make[1]: se sale del directorio
 `/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/host-tools/offline-renderer'
 make: *** [parse] Error 2

OK that's fixed now. Chris already checked in the code. Our build
worked fine. We need to do a few more tweaks and then we can post a
(super) early test image. Give us until early this coming week.

  -Sean



Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-30 Thread David Reyes Samblas Martinez
Are you uploading these changes to git? Can I take a look?

David Reyes Samblas Martinez
http://www.tuxbrain.com
Open ultraportable  embedded solutions
Openmoko, Openpandora,  Arduino
Hey, watch out!!! There's a linux in your pocket!!!






Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-30 Thread Laszlo KREKACS
On Fri, Oct 30, 2009 at 4:22 PM, David Reyes Samblas Martinez
da...@tuxbrain.com wrote:
 Are you uploading these changes to git? Can I take a look?

Btw, is there any plan to implement image rendering?

If so, any time estimate?

Best regards,
 Laszlo



Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-30 Thread David Reyes Samblas Martinez
Just one thing I realized: the titles of all the faulty articles start
with the ~ symbol.
Regards
David Reyes Samblas Martinez
http://www.tuxbrain.com
Open ultraportable  embedded solutions
Openmoko, Openpandora,  Arduino
Hey, watch out!!! There's a linux in your pocket!!!






Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-30 Thread David Reyes Samblas Martinez
2009/10/30 Laszlo KREKACS laszlo.krekacs.l...@gmail.com:
 On Fri, Oct 30, 2009 at 4:22 PM, David Reyes Samblas Martinez
 da...@tuxbrain.com wrote:
 Are you uploading these changes to git? Can I take a look?

 Btw, is there any plan to implement image rendering?

 If so, any time estimate?

 Best regards,
  Laszlo


Some kind of renderer has already been implemented, because the keyboard
and the erase-history dialog are images. Am I wrong?



Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-30 Thread Sean Moss-Pultz
On Fri, Oct 30, 2009 at 11:22 PM, David Reyes Samblas Martinez
da...@tuxbrain.com wrote:
 Are you uploading these changes to git? Can I take a look?

Yes. The latest commit fixes it. Have a look here:

  http://github.com/wikireader/wikireader

Sean



Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-30 Thread Sean Moss-Pultz
On Fri, Oct 30, 2009 at 11:29 PM, Laszlo KREKACS
laszlo.krekacs.l...@gmail.com wrote:
 On Fri, Oct 30, 2009 at 4:22 PM, David Reyes Samblas Martinez
 da...@tuxbrain.com wrote:
 Are you uploading these changes to git? Can I take a look?

 Btw, is there any plan to implement image rendering?

Math (images) is on our roadmap, hopefully before the end of this
year. The screen is only 1-bit, so anything else would look kinda
funny.

 -Sean
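
To illustrate why ordinary images would "look kinda funny" on a 1-bit screen, here is a minimal sketch of reducing 8-bit grayscale pixels to one bit each and packing eight pixels per byte. The function name, the threshold of 128 and the "set bit means light pixel" polarity are all assumptions for illustration, not the WikiReader's actual image format:

```python
def pack_1bit(gray_rows, threshold=128):
    """Threshold 8-bit grayscale rows and pack 8 pixels per byte, MSB first."""
    packed = []
    for row in gray_rows:
        out = bytearray()
        for start in range(0, len(row), 8):
            byte = 0
            chunk = row[start:start + 8]
            for offset, pixel in enumerate(chunk):
                if pixel >= threshold:      # assumed polarity: light pixel -> bit set
                    byte |= 0x80 >> offset
            out.append(byte)
        packed.append(bytes(out))
    return packed


# An 8-step gradient from black to light grey collapses to just two levels.
gradient = [[0, 32, 64, 96, 128, 160, 192, 224]]
print(pack_1bit(gradient))
```

All intermediate grey levels are lost, which is acceptable for line art such as rendered math but not for photographs unless some form of dithering is applied.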



Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-30 Thread Sean Moss-Pultz
On Sat, Oct 31, 2009 at 2:46 AM, David Reyes Samblas Martinez
da...@tuxbrain.com wrote:
 Just one thing I realized: the titles of all the faulty articles start
 with the ~ symbol.

David

No, that's not a problem. That character gets removed in a later build
stage. We had to add it because of an integer conversion issue with
SQLite. It was automatically converting article titles like 1984 into
integers (not strings) and storing them that way in the database.

SQLite, BTW, claims this is a feature.

Sean
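
The conversion described above matches SQLite's documented type affinity rules. A minimal sketch, assuming the title column was declared with NUMERIC affinity (the project's real schema is not shown in this thread, so the table and column names here are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: a column with NUMERIC affinity; the real schema is not shown.
conn.execute("CREATE TABLE articles (title NUMERIC)")
conn.execute("INSERT INTO articles (title) VALUES (?)", ("1984",))
conn.execute("INSERT INTO articles (title) VALUES (?)", ("~1984",))

rows = conn.execute(
    "SELECT title, typeof(title) FROM articles ORDER BY rowid"
).fetchall()
for value, storage_class in rows:
    # The plain numeric title is stored as an integer; the ~ prefix keeps it text.
    print(repr(value), storage_class)
conn.close()
```

Declaring the column as TEXT would be another way to keep such titles as strings; the ~ prefix is the workaround the message describes.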



[wikireader]Error on parsing the spanish wikipedia

2009-10-29 Thread David Reyes Samblas Martinez
Hi, I'm trying to generate the files for a Spanish Wikipedia on the WR.
After successfully compiling the source from git, I had to solve some
annoyances with UTF-8 encoding in Python. The error was something like this:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in
position: ordinal not in range(128)
I solved this by changing the default encoding from ascii to utf8 in the
/usr/lib/python2.6/site.py file.
After this I was able to run the command:
make DESTDIR=image WORKDIR=work \
  XML_FILES=xml-file-samples/eswiki-latest-pages-articles.xml \
  index parse render combine

Everything seemed fine for a couple of hours (about 6-7 h) of parsing the
70 articles in Spanish, but then ... the horror:
Count: 38
Traceback (most recent call last):
  File "./ArticleParser.py", line 224, in <module>
    main()
  File "./ArticleParser.py", line 172, in main
    process_article_text(title.encode('utf-8'),  f.read(length), newf)
  File "./ArticleParser.py", line 218, in process_article_text
    newf.write(text + '\n')
IOError: [Errno 32] Broken pipe
make[1]: *** [parse] Error 1
make[1]: Leaving directory `/OE/Proyectos/tuxbrain/productos/wikireader/wikireader/host-tools/offline-renderer'
make: *** [parse] Error 2

I have relaunched the process with the (faint) hope that it was a
temporary fault, but if anyone has a clue it would be helpful.

BTW, I am documenting this whole process to write a step-by-step howto
on putting Wikipedia in other languages on the WikiReader.



David Reyes Samblas Martinez
http://www.tuxbrain.com
Open ultraportable  embedded solutions
Openmoko, Openpandora,  Arduino
Hey, watch out!!! There's a linux in your pocket!!!



Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-29 Thread Sean Moss-Pultz
David

We're working on exactly the same thing now :-)

I'll ask Chris to email the list once we get past it. I think the
problem is the mixture of different encodings (Latin-1 and UTF-8)
in the Spanish Wikipedia and the way our code handles it.
For some reason Python's print (at times) wants to default to ASCII,
even after we explicitly tell it to use UTF-8.

  -Sean
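
One common way to cope with input that mixes UTF-8 and Latin-1 is to decode explicitly and fall back when UTF-8 fails. The following is only a sketch of that idea, not the fix that was checked into the WikiReader tools; it is written for Python 3 (the thread concerns Python 2.6), and the function name is hypothetical:

```python
def decode_article_bytes(raw):
    """Decode raw bytes as UTF-8, falling back to Latin-1 on failure."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this never raises.
        return raw.decode("latin-1")


utf8_sample = "España".encode("utf-8")      # contains the byte 0xc3
latin1_sample = "España".encode("latin-1")  # contains the byte 0xf1

print(decode_article_bytes(utf8_sample))
print(decode_article_bytes(latin1_sample))
```

Decoding at the point where bytes enter the program avoids relying on the interpreter-wide default encoding, so no change to site.py is needed. The fallback can mis-decode text that is in some third encoding, so it is a heuristic rather than a guarantee.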




Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-29 Thread David Reyes Samblas Martinez
Great! :) Good to see you are working on this! Please count on me for
any testing to be done. I will try to take a look at the code myself
to kill the bug, but I have neither the time nor the expertise, so no promises :P
David Reyes Samblas Martinez
http://www.tuxbrain.com
Open ultraportable  embedded solutions
Openmoko, Openpandora,  Arduino
Hey, watch out!!! There's a linux in your pocket!!!






Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-29 Thread Nelson Castillo
On Thu, Oct 29, 2009 at 6:54 PM, David Reyes Samblas Martinez
da...@tuxbrain.com wrote:
 Great! :) Good to see you are working on this! Please count on me for
 any testing to be done. I will try to take a look at the code myself
 to kill the bug, but I have neither the time nor the expertise, so no promises :P

I haven't seen the code, but if you don't feel like fixing it now you
can add a try/except around the block that processes each page, so that
you have a wiki to play with while the error is being fixed.
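
A minimal sketch of that suggestion, written without knowledge of the actual ArticleParser.py code: wrap the per-article work in try/except, record the failures and keep going. The names process_all and demo_process are hypothetical, and the example uses Python 3 syntax:

```python
def process_all(articles, process_one):
    """Process each (title, text) pair; collect failures instead of aborting."""
    failed = []
    for title, text in articles:
        try:
            process_one(title, text)
        except Exception as exc:  # broad on purpose: keep the batch running
            failed.append((title, repr(exc)))
    return failed


def demo_process(title, text):
    # Hypothetical stand-in for the real per-article work.
    if not text:
        raise ValueError("empty article")


articles = [("Madrid", "Capital de España"), ("Roto", ""), ("1984", "Novela")]
failures = process_all(articles, demo_process)
print(failures)
```

This helps with errors that are specific to one article. A broken pipe is different: once the reading end has gone away, every later write is likely to fail as well, so skipping articles would not produce a usable result in that case.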



Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-29 Thread Sean Moss-Pultz
On Fri, Oct 30, 2009 at 7:54 AM, David Reyes Samblas Martinez
da...@tuxbrain.com wrote:
 Great! :) Good to see you are working on this! Please count on me for
 any testing to be done. I will try to take a look at the code myself
 to kill the bug, but I have neither the time nor the expertise, so no promises :P

We'll get it working. Just give us a bit of time. And it would be
super helpful if you could help test / QA. Thanks a lot for the offer!

  -Sean



Re: [wikireader]Error on parsing the spanish wikipedia

2009-10-29 Thread Sean Moss-Pultz
On Fri, Oct 30, 2009 at 7:58 AM, Nelson Castillo
arhu...@freaks-unidos.net wrote:
 On Thu, Oct 29, 2009 at 6:54 PM, David Reyes Samblas Martinez
 da...@tuxbrain.com wrote:
 Great! :) Good to see you are working on this! Please count on me for
 any testing to be done. I will try to take a look at the code myself
 to kill the bug, but I have neither the time nor the expertise, so no promises :P

 I haven't seen the code, but if you don't feel like fixing it now you
 can add a try/except around the block that processes each page, so that
 you have a wiki to play with while the error is being fixed.

Yeah, we're trying exactly that, Nelson. It's just a long process to
render all this stuff. We actually have 9 quad-core systems running in
parallel now, each with at least 6 GB of RAM :-)

  -Sean
