Re: Including Adobe CMaps

2014-02-28 Thread Gervase Markham
On 26/02/14 20:21, Jonathan Kew wrote: Let's turn this question around. If we had an on-demand way to load stuff like this, what else would we want to load on demand? A few examples: Spell-checking dictionaries, Hyphenation tables, Fonts for additional scripts. If this came with an update

Re: Including Adobe CMaps

2014-02-28 Thread Jonathan Kew
On 28/2/14 11:44, Gervase Markham wrote: On 26/02/14 20:21, Jonathan Kew wrote: Let's turn this question around. If we had an on-demand way to load stuff like this, what else would we want to load on demand? A few examples: Spell-checking dictionaries, Hyphenation tables, Fonts for additional

Re: Including Adobe CMaps

2014-02-28 Thread Gervase Markham
On 28/02/14 12:37, Jonathan Kew wrote: Presumably we always want the complete PSL available. So it really should be part of the base product, not a [try-to-]load-on-demand resource. I was proposing it be part of the base product, but updated on demand. Isn't it sufficient to update that with

Re: Including Adobe CMaps

2014-02-28 Thread Robert Kaiser
Boris Zbarsky schrieb: On 2/26/14 3:58 PM, Wesley Hardman wrote: Personally, I would prefer to have it already available. We have several deployment targets with different tradeoffs. Broadly speaking: Phones: expensive data, limited storage. Want to not use up the storage, so download

Re: Including Adobe CMaps

2014-02-27 Thread Mike Hommey
On Thu, Feb 27, 2014 at 01:30:58AM +0100, Andreas Gal wrote: Could we compress major parts of omni.ja en bloc? We could for example stick all JS we load at startup into a zip with zero compression and then compress that into an outer zip. I think we already support nested containers like

Re: Including Adobe CMaps

2014-02-27 Thread Neil
Andreas Gal wrote: Could we compress major parts of omni.ja en bloc? We could for example stick all JS we load at startup into a zip with zero compression and then compress that into an outer zip. I think we already support nested containers like that. Assuming your math is correct even

Re: Including Adobe CMaps

2014-02-27 Thread Benjamin Smedberg
On 2/26/2014 4:36 PM, Bobby Holley wrote: On Wed, Feb 26, 2014 at 12:58 PM, Wesley Hardman whardma...@gmail.com wrote: It seems like it would be trivial to add a button in the Preferences UI to let people precache all dynamically-loaded data. I don't think that would be trivial. In particular,

Re: Including Adobe CMaps

2014-02-27 Thread Bobby Holley
On Thu, Feb 27, 2014 at 6:27 AM, Benjamin Smedberg benja...@smedbergs.us wrote: On 2/26/2014 4:36 PM, Bobby Holley wrote: On Wed, Feb 26, 2014 at 12:58 PM, Wesley Hardman whardma...@gmail.com wrote: It seems like it would be trivial to add a button in the Preferences UI to let people

Re: Including Adobe CMaps

2014-02-27 Thread Nick Alexander
On 2/27/2014, 12:30 AM, Axel Hecht wrote: The feature of zip we want is the index, which lets us seek to a position in the bundle and start unpacking, given just the filename. How hard is it to actually create a data structure for the same purpose for a tar.xz or so? I don't really know anything

Re: Including Adobe CMaps

2014-02-27 Thread Mike Hommey
On Thu, Feb 27, 2014 at 07:33:39PM +0900, Mike Hommey wrote: On Thu, Feb 27, 2014 at 01:30:58AM +0100, Andreas Gal wrote: Could we compress major parts of omni.ja en bloc? We could for example stick all JS we load at startup into a zip with zero compression and then compress that into

Re: Including Adobe CMaps

2014-02-26 Thread Brendan Dahl
Yury Delendik worked on reformatting the files a bit and was able to get them down to 1.1MB binary which gzips to 990KB. This seems like a reasonable size to me and involves a lot less work than setting up a process for distributing these files via CDN. Brendan On Feb 24, 2014, at 10:14 PM,

Re: Including Adobe CMaps

2014-02-26 Thread Bobby Holley
That's still a ton for something that most of our users will not (or will rarely) use. I think we absolutely need to get an on-demand story for this kind of stuff. It isn't the first time it has come up. bholley On Wed, Feb 26, 2014 at 11:38 AM, Brendan Dahl bd...@mozilla.com wrote: Yury

Re: Including Adobe CMaps

2014-02-26 Thread Andreas Gal
This randomly reminds me that it might be time to review zip as our compression format for omni.ja. ls -l omni.ja: 7862939; ls -l omni.tar.xz (tar and then xz -z): 4814416. LZMA2 is available as a public domain implementation. It uses a bit more memory than zip, but it's still within reason
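To put the two sizes quoted above in proportion, here is a small arithmetic sketch. Only the two byte counts come from the message; the constant and function names are made up for illustration.

```python
# Byte counts as quoted in the message; the names are illustrative only.
ZIP_SIZE = 7862939      # omni.ja (zip)
TAR_XZ_SIZE = 4814416   # omni.tar.xz (tar, then xz -z)


def savings(before: int, after: int) -> tuple[int, float]:
    """Return (bytes saved, percentage saved relative to `before`)."""
    saved = before - after
    return saved, 100.0 * saved / before


saved_bytes, saved_pct = savings(ZIP_SIZE, TAR_XZ_SIZE)
print(f"{saved_bytes} bytes saved ({saved_pct:.1f}% smaller)")
```

That works out to roughly 3 MB, a bit under 40% of the current archive size.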

Re: Including Adobe CMaps

2014-02-26 Thread Andreas Gal
Let's turn this question around. If we had an on-demand way to load stuff like this, what else would we want to load on demand? Andreas On Feb 26, 2014, at 8:53 PM, Bobby Holley bobbyhol...@gmail.com wrote: That's still a ton for something that most of our users will not (or will rarely)

Re: Including Adobe CMaps

2014-02-26 Thread Jonathan Kew
On 26/2/14 19:57, Andreas Gal wrote: Let's turn this question around. If we had an on-demand way to load stuff like this, what else would we want to load on demand? A few examples: Spell-checking dictionaries, Hyphenation tables, Fonts for additional scripts. JK Andreas On Feb 26, 2014,

Re: Including Adobe CMaps

2014-02-26 Thread Benjamin Smedberg
On 2/26/2014 3:21 PM, Jonathan Kew wrote: On 26/2/14 19:57, Andreas Gal wrote: Let's turn this question around. If we had an on-demand way to load stuff like this, what else would we want to load on demand? A few examples: Spell-checking dictionaries, Hyphenation tables, Fonts for additional

Re: Including Adobe CMaps

2014-02-26 Thread Gregory Szorc
https://bugzilla.mozilla.org/show_bug.cgi?id=977292 Assigned to nobody. On 2/26/2014 12:49 PM, Andreas Gal wrote: This sounds like quite an opportunity to shorten download times and reduce CDN load. Who wants to file the bug? :) Andreas On Feb 26, 2014, at 9:44 PM, Benjamin Smedberg

Re: Including Adobe CMaps

2014-02-26 Thread Nick Alexander
On 2/26/2014, 11:56 AM, Andreas Gal wrote: This randomly reminds me that it might be time to review zip as our compression format for omni.ja. ls -l omni.ja: 7862939; ls -l omni.tar.xz (tar and then xz -z): 4814416. LZMA2 is available as a public domain implementation. It uses a bit more

Re: Including Adobe CMaps

2014-02-26 Thread Boris Zbarsky
On 2/26/14 3:58 PM, Wesley Hardman wrote: Personally, I would prefer to have it already available. We have several deployment targets with different tradeoffs. Broadly speaking: Phones: expensive data, limited storage. Want to not use up the storage, so download lazily. Consumer

Re: Including Adobe CMaps

2014-02-26 Thread Mike Hommey
On Wed, Feb 26, 2014 at 08:56:37PM +0100, Andreas Gal wrote: This randomly reminds me that it might be time to review zip as our compression format for omni.ja. ls -l omni.ja: 7862939; ls -l omni.tar.xz (tar and then xz -z): 4814416. LZMA2 is available as a public domain

Re: Including Adobe CMaps

2014-02-26 Thread Mike Hommey
On Thu, Feb 27, 2014 at 08:25:00AM +0900, Mike Hommey wrote: On Wed, Feb 26, 2014 at 08:56:37PM +0100, Andreas Gal wrote: This randomly reminds me that it might be time to review zip as our compression format for omni.ja. ls -l omni.ja: 7862939; ls -l omni.tar.xz (tar and

Re: Including Adobe CMaps

2014-02-26 Thread Andreas Gal
Could we compress major parts of omni.ja en bloc? We could for example stick all JS we load at startup into a zip with zero compression and then compress that into an outer zip. I think we already support nested containers like that. Assuming your math is correct even without adding LZMA2
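The nested-container idea can be sketched roughly as follows, using Python's standard zipfile module rather than anything from the Firefox build system. The entry name "startup.zip" and the function names are assumptions for illustration; this is not how omni.ja is actually packaged.

```python
import io
import zipfile

INNER_NAME = "startup.zip"  # hypothetical name for the nested container


def build_nested(files: dict[str, bytes]) -> bytes:
    """Store files uncompressed in an inner zip, then deflate it as a whole."""
    inner_buf = io.BytesIO()
    # Inner archive: zero compression, so the outer pass sees raw file data.
    with zipfile.ZipFile(inner_buf, "w", compression=zipfile.ZIP_STORED) as inner:
        for name, data in files.items():
            inner.writestr(name, data)

    outer_buf = io.BytesIO()
    # Outer archive: one deflate stream over the whole inner archive, so
    # redundancy shared between files can be exploited.
    with zipfile.ZipFile(outer_buf, "w", compression=zipfile.ZIP_DEFLATED) as outer:
        outer.writestr(INNER_NAME, inner_buf.getvalue())
    return outer_buf.getvalue()


def read_nested(archive: bytes, name: str) -> bytes:
    """Inflate the inner archive once, then read one member from it."""
    with zipfile.ZipFile(io.BytesIO(archive)) as outer:
        inner_bytes = outer.read(INNER_NAME)
    with zipfile.ZipFile(io.BytesIO(inner_bytes)) as inner:
        return inner.read(name)
```

The trade-off in this sketch is that reading any single file means inflating the whole inner archive first, which is only attractive if everything in it is needed at startup anyway.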

Including Adobe CMaps

2014-02-24 Thread Brendan Dahl
PDF.js plans to soon start including and using Adobe CMap files for converting character codes to character IDs (CIDs) and mapping character codes to Unicode values. This will fix a number of bugs in PDF.js and will improve our support for Chinese, Korean, and Japanese (CJK) documents. I wanted

Re: Including Adobe CMaps

2014-02-24 Thread Andreas Gal
Is this something we could load dynamically and offline cache? Andreas Sent from Mobile. On Feb 24, 2014, at 23:41, Brendan Dahl bd...@mozilla.com wrote: PDF.js plans to soon start including and using Adobe CMap files for converting character codes to character IDs (CIDs) and mapping

Re: Including Adobe CMaps

2014-02-24 Thread Kyle Huey
On Mon, Feb 24, 2014 at 3:01 PM, Andreas Gal andreas@gmail.com wrote: Is this something we could load dynamically and offline cache? Andreas Sent from Mobile. On Feb 24, 2014, at 23:41, Brendan Dahl bd...@mozilla.com wrote: PDF.js plans to soon start including and using Adobe CMap

Re: Including Adobe CMaps

2014-02-24 Thread Ralph Giles
On 2014-02-24 2:41 PM, Brendan Dahl wrote: There are 168 files with an average size of ~40KB, and all of the files together are roughly 6.9M, or 2.2M when gzipped. IIRC MuPDF was able to save significant space by pre-parsing the files. Their code for that is GPL (and oriented toward compiling-in

Re: Including Adobe CMaps

2014-02-24 Thread Rik Cabanier
On Mon, Feb 24, 2014 at 3:01 PM, Andreas Gal andreas@gmail.com wrote: Is this something we could load dynamically and offline cache? That should be possible. The CMap name is in the PDF so Firefox could download it on demand. Also, if the user has Acrobat, the CMaps are already on their
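A minimal sketch of "download on demand, then cache", assuming the CMap name has already been read from the PDF. The function load_cmap, the fetch callback and the cache layout are all hypothetical; nothing here reflects an actual Firefox or PDF.js API.

```python
import re
import tempfile
from pathlib import Path
from typing import Callable

# The name comes from an untrusted PDF, so only accept plain file names.
_SAFE_NAME = re.compile(r"[A-Za-z0-9._-]+")


def load_cmap(name: str, cache_dir: Path, fetch: Callable[[str], bytes]) -> bytes:
    """Return CMap data from the local cache, fetching it on first use.

    `fetch` is a hypothetical callback that downloads one CMap by name.
    """
    if not _SAFE_NAME.fullmatch(name) or name in (".", ".."):
        raise ValueError(f"unsafe CMap name: {name!r}")
    cached = cache_dir / name
    if cached.exists():
        return cached.read_bytes()          # cache hit: no network needed
    data = fetch(name)                      # cache miss: download once
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached.write_bytes(data)
    return data


if __name__ == "__main__":
    # Demo with a fake fetcher standing in for a real download.
    with tempfile.TemporaryDirectory() as tmp:
        fake = lambda n: f"cmap data for {n}".encode()
        print(load_cmap("UniJIS-UCS2-H", Path(tmp), fake))
```

In this sketch a user who only reads documents in one language would end up caching only the handful of CMaps those documents reference.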

Re: Including Adobe CMaps

2014-02-24 Thread Brendan Dahl
It’s certainly possible to load dynamically. Do we currently do this for any other Firefox resources? From what I’ve seen, many PDFs use CMaps even if they don’t necessarily have CJK characters, so it may just be better to include them. FWIW both Poppler and MuPDF embed the CMaps. Brendan

Re: Including Adobe CMaps

2014-02-24 Thread Andreas Gal
My assumption is that certain users only need certain CMaps because they tend to read only documents in certain languages. This seems like something we can really optimize and avoid ahead-of-time download cost for. The fact that we don’t do this yet doesn’t seem like a good criterion. There is