Re: [i18n] jcr resource bundle provider
Hi

On 19.12.2013, at 00:10, Carsten Ziegeler cziege...@apache.org wrote:

> shop=ship

From a deployer's point of view "shop" makes a lot of sense, actually.

Regards
Felix

> 2013/12/19 Carsten Ziegeler cziege...@apache.org
>
> Ok, but I don't want to shop a new i18n implementation which is incompatible and would require manual changes to the content before it can be installed. So either it automatically updates content, or it is able to detect which format is used and acts accordingly.
>
> Carsten
Re: [i18n] jcr resource bundle provider
Hi

I wonder whether we should not actually radically change the data structure: move away from the fine-grained content and back to file blobs in Properties or maybe JSON format?

Currently the i18n bundle reads all translation entries for the request language in bulk anyway (it used to read them on demand, but that is way back).

Since this is a radical approach, we can have both approaches sit side by side and support both. If the new one is used, performance might be better. If the old one is still used, it still works.

Then we can offer a web console plugin (which might be a good idea anyway, for example to show which languages we have) to ask the bundle to migrate the old structure to the new structure.

Regards
Felix
Re: [i18n] jcr resource bundle provider
Hi,

On Fri, Dec 20, 2013 at 10:27 AM, Felix Meschberger fmesc...@adobe.com wrote:

> ...Since this is a radical approach, we can have both approaches sit side by side and support both. If the new one is used, performance might be better. If the old one is still used, it still works...

How would you handle the case where both old and new i18n data are present?

I'd suggest handling that per language: if a file is present, node entries of the same language are ignored, with suitable log warnings if such ignored data is present.

> ...Then we can offer a web console plugin to ask the bundle to migrate the old structure to the new structure...

Which could mean converting i18n nodes to file data, and moving the i18n nodes to an archive location if successful?

Sounds good to me.

-Bertrand
Re: [i18n] jcr resource bundle provider
Hi

On 20.12.2013, at 11:15, Bertrand Delacretaz bdelacre...@apache.org wrote:

> How would you handle the case where both old and new i18n data are present?
>
> I'd suggest handling that per language: if a file is present, node entries of the same language are ignored, with suitable log warnings if such ignored data is present.

For example, yes.

> Which could mean converting i18n nodes to file data, and moving the i18n nodes to an archive location if successful?
>
> Sounds good to me.

Exactly, or actually just removing the old nodes (this might be configurable).

Regards
Felix
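The per-language precedence rule agreed on above could be sketched roughly as follows. This is only an illustration over plain Python dictionaries: `effective_entries`, its two arguments and the logger name are hypothetical and do not correspond to anything in the Sling i18n bundle.

```python
import logging

log = logging.getLogger("i18n.sketch")  # hypothetical logger name


def effective_entries(file_dicts, node_dicts):
    """Pick one source per language: a dictionary file wins over node entries.

    file_dicts / node_dicts: language -> {key: message}, standing in for the
    new file-based storage and the old fine-grained node storage.
    """
    result = {}
    for lang in sorted(set(file_dicts) | set(node_dicts)):
        if lang in file_dicts:
            ignored = node_dicts.get(lang)
            if ignored:
                # Old-style data exists for a language that already has a file.
                log.warning(
                    "ignoring %d node entries for language %r: "
                    "a dictionary file is present", len(ignored), lang)
            result[lang] = dict(file_dicts[lang])
        else:
            result[lang] = dict(node_dicts[lang])
    return result
```

With this rule a language never mixes the two sources, which keeps the outcome predictable while a migration is only partly done.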
Re: [i18n] jcr resource bundle provider
Hi,

On Fri, Dec 20, 2013 at 1:27 AM, Felix Meschberger fmesc...@adobe.com wrote:

> I wonder whether we should not actually radically change the data structure: move away from the fine-grained content and back to file blobs in Properties or maybe JSON format?

that would certainly be a good option that also helps reduce the overhead we have today managing/installing/deploying large dictionaries. so something like:

  /en.json [sling:Dictionary, nt:file]
    /jcr:content

or better

  /en.json [nt:file]
    /jcr:content [nt:resource, sling:DictionaryResource]
      - sling:language
      - jcr:data

or

  /en [sling:DictionaryFolder, sling:Folder]
    /dict1.json [nt:file]
    /dict2.json [nt:file]

however, there are currently tools and UIs developed on top of the existing content-based solution which assume the fine-grained storage. so changing the structure would also impact those.

> Currently the i18n bundle reads all translation entries for the request language in bulk anyway (it used to read them on demand, but that is way back).
>
> Since this is a radical approach, we can have both approaches sit side by side and support both. If the new one is used, performance might be better. If the old one is still used, it still works.

So this would be a new resource bundle provider, and over time we can disable / remove the old one. or would you make the existing one support the new storage model?

> Then we can offer a web console plugin (which might be a good idea anyway, for example to show which languages we have) to ask the bundle to migrate the old structure to the new structure.

If we use JSON for the new storage model, it should be as easy as requesting /libs/i18n/en.2.json

In any case, my proposed patch in SLING-2881 at least solves the overly frequent flushing of the resource bundle cache for the current storage model. it would be good if a sling committer could look at it. thanks.

regards, toby
Re: [i18n] jcr resource bundle provider
On 20.12.2013, at 10:08, Tobias Bocanegra tri...@apache.org wrote:

> however, there are currently tools and UIs developed on top of the existing content-based solution which assume the fine-grained storage. so changing the structure would also impact those.

Yes, having written a few of those tools (all proprietary), I can say that this would require a lot of changes in that area. For example, we use JCR FileVault (which was just recently open sourced at Jackrabbit) to bring in our translations: they come from the translators as XLIFF files, and we merge the vault XML files with the updates from the XLIFF files. It's not impossible to adapt the tools, but it is considerable effort.

> So this would be a new resource bundle provider, and over time we can disable / remove the old one. or would you make the existing one support the new storage model?

A new one sounds better IMO.

> If we use JSON for the new storage model, it should be as easy as requesting /libs/i18n/en.2.json

Note that there is a difference between a single dictionary at some location and the JcrResourceBundle, which merges all of them into a single big view.

> In any case, my proposed patch in SLING-2881 at least solves the overly frequent flushing of the resource bundle cache for the current storage model. it would be good if a sling committer could look at it.

Right, this should be fixed separately from any further solution that requires content migrations.

Cheers,
Alex
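To illustrate the distinction between a single dictionary and the merged view, here is a small sketch. It assumes that each dictionary is a flat JSON object mapping keys to messages and that later dictionaries override earlier ones. Neither the format nor the merge order is defined anywhere in this thread, and `merged_view` is a hypothetical name.

```python
import json


def merged_view(dictionary_blobs):
    """Merge several JSON dictionaries for one language into one mapping.

    Assumption: each blob is a flat JSON object of key -> message, and a
    later blob overrides an earlier one on key collisions.
    """
    view = {}
    for blob in dictionary_blobs:
        entries = json.loads(blob)
        if not isinstance(entries, dict):
            raise ValueError("dictionary must be a JSON object")
        view.update(entries)
    return view
```

Requesting one dictionary, for example /libs/i18n/en.2.json, would only ever show one of the inputs to such a merge, not the resulting view.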
Re: [i18n] jcr resource bundle provider
On 18.12.2013, at 21:44, Tobias Bocanegra tri...@apache.org wrote:

> I don't really like the automatic upgrade, since we really don't know how to differentiate use of mix:language/sling:message in compact subtrees vs the sparse case. It might be too expensive to traverse the mix:language nodes for each bundle activation.

We don't need a traversal, just that one query we currently already do and which we want to replace because it is potentially a bit slow. And it's only on the _first_ bundle activation; afterwards the "has been migrated" flag is set.

The nested mix:language scenario should be easy to handle by just looking at the paths:

  + mix:language
    ...
      + mix:language
        ...
          + sling:Message

If you get nested paths, you only look at the longer one. The only awful scenario could be this:

  + mix:language
    ...
      + sling:Message
      + mix:language
        ...
          + sling:Message

But it doesn't make any sense, so I can't imagine anyone using it.

> I would rather have a semi automatic fallback as described before:
>
> 0. read the operation-mode osgi config property (default: auto).
> 1. if mode==auto:
>    1.1. search (query) for all sling:Dictionary nodes
>    1.2. if none found, assume old content and set mode=legacy

That only allows one version at a time. If we update the out-of-the-box dictionaries in our application to use the sling:dictionary node types, but customers have their own dictionaries, these would not work anymore after upgrading.

Cheers,
Alex
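The "only look at the longer one" rule for nested mix:language paths can be sketched as a simple path filter. This is a minimal illustration on path strings only; `innermost_roots` is a hypothetical helper, not part of Sling.

```python
def innermost_roots(paths):
    """Drop every path that is an ancestor of another path in the list.

    paths: absolute repository paths of mix:language nodes returned by the
    query. Only the deepest (longest) path of each nested chain is kept.
    """
    unique = sorted({p.rstrip("/") for p in paths})
    return [p for p in unique
            if not any(other.startswith(p + "/") for other in unique)]
```

Note that in the "awful" scenario, where messages also sit directly below the outer mix:language node, this filter would skip those outer messages. That is the trade-off described in the message.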
Re: [i18n] jcr resource bundle provider
I don't think that the migration is straightforward. the way the provider currently works, it would allow message definitions like:

  /content/de [mix:language]
    /very/deep/structure/
      /hello [sling:Message]
        + sling:message Hallo

i.e. the messages can be deliberately distributed over the content, where needed. we don't know how i18n support is used in general. Adobe's Granite and CQ (and probably most of their customers) use full subtree dictionaries like the example in [1].

otoh, applications that use compact dicts probably don't have many, and adding a new sling:Dictionary mixin to those 5-10 nodes is no big effort.

Regards, Toby
Re: [i18n] jcr resource bundle provider
shop=ship

2013/12/19 Carsten Ziegeler cziege...@apache.org

> Ok, but I don't want to shop a new i18n implementation which is incompatible and would require manual changes to the content before it can be installed. So either it automatically updates content, or it is able to detect which format is used and acts accordingly.
>
> Carsten

--
Carsten Ziegeler cziege...@apache.org
Re: [i18n] jcr resource bundle provider
On 17.12.2013, at 23:12, Carsten Ziegeler cziege...@apache.org wrote:

> The bundle can either set a marker in the repository

That's probably something we should avoid. The question is where? And why?

> or a file in the bundle private data;

Sounds better.

> the repository is the better place as this can be used in a clustered installation to avoid duplicate or concurrent migration

The migration must be idempotent, and it easily will be: it finds the dictionary nodes that only have mix:language but no sling:dictionary yet. If these all have a sling:dictionary, it simply does nothing and sets the migrated flag. Thus this will run on startup for each cluster node, is a single query as we have now, and doesn't cost anything.

Better yet, it runs lazily on the first request access, so it should avoid concurrent migrations quite well. Anyway, these can be handled by retrying when an ItemModifiedException comes up (or whatever the exception is called).

Cheers,
Alex
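A minimal sketch of why such a migration is idempotent, using an in-memory dictionary as a stand-in for the repository. The names `migrate` and `is_dictionary_root` and the path-to-types mapping are hypothetical; a real implementation would use the JCR query and mixin API instead.

```python
def is_dictionary_root(path, nodes):
    """True if some node below `path` carries the sling:Message mixin."""
    prefix = path.rstrip("/") + "/"
    return any(p.startswith(prefix) and "sling:Message" in types
               for p, types in nodes.items())


def migrate(nodes):
    """Add sling:dictionary to mix:language dictionary roots that lack it.

    nodes: path -> set of node types/mixins (hypothetical repository model).
    Returns the changed paths; an empty list means there was nothing to do.
    """
    changed = []
    for path in sorted(nodes):
        types = nodes[path]
        if ("mix:language" in types
                and "sling:dictionary" not in types
                and is_dictionary_root(path, nodes)):
            types.add("sling:dictionary")
            changed.append(path)
    return changed
```

Running `migrate` a second time on the same data changes nothing, so a repeated or concurrent run on another cluster node would at worst redo a cheap no-op.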
Re: [i18n] jcr resource bundle provider
I don't really like the automatic upgrade, since we really don't know how to differentiate use of mix:language/sling:message in compact subtrees vs the sparse case. It might be too expensive to traverse the mix:language nodes for each bundle activation. we could do some heuristic and only look at the first level, and if there are 100% sling:message nodes, assume that it's a compact dictionary. in any case, you would need to make this check for each activation.

I would rather have a semi automatic fallback as described before:

0. read the operation-mode osgi config property (default: auto).
1. if mode==auto:
   1.1. search (query) for all sling:Dictionary nodes
   1.2. if none found, assume old content and set mode=legacy
2. if mode==legacy, do the current behavior by querying all messages below the language roots
3. if mode==modern, query all dictionary roots (or use the result from 1.1 if available) and traverse subtrees when resource bundles are loaded

regards, toby
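The numbered fallback steps in the message above can be sketched as a small function. The mode names `auto`, `legacy` and `modern` come from the message; `resolve_mode` and the callable standing in for the sling:Dictionary query are hypothetical. Switching to `modern` when dictionaries are found is an assumption, since the steps leave that branch implicit.

```python
def resolve_mode(configured_mode, find_dictionary_roots):
    """Return (mode, dictionary_roots) for the configured operation mode.

    configured_mode: "auto", "legacy" or "modern" (step 0).
    find_dictionary_roots: callable standing in for the query in step 1.1.
    """
    if configured_mode not in ("auto", "legacy", "modern"):
        raise ValueError(f"unknown operation mode: {configured_mode}")
    if configured_mode != "auto":
        # Explicit configuration wins; no detection query is needed.
        return configured_mode, None
    roots = find_dictionary_roots()  # step 1.1
    if not roots:
        return "legacy", None        # step 1.2
    # Assumption: finding dictionaries in auto mode means mode=modern.
    return "modern", roots
```

Returning the roots alongside the mode lets step 3 reuse the result of step 1.1 instead of repeating the query.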
[i18n] jcr resource bundle provider
Hi,

I was looking at SLING-2881 [0] and reading the docs at [1]. the i18n code has 2 queries. one is:

1) //element(*,mix:language)

and the other is:

2) //element(*,mix:language)[@jcr:language='en']//element(*,sling:Message)[@sling:message]/(@sling:key|@sling:message)

the first one is to find all language bundle roots, and the second is used to actually load all messages for one bundle.

I suggest replacing the second one with a manual traversal, since all messages are usually just directly below the language nodes.

WDYT?

Regards, Toby

[0] https://issues.apache.org/jira/browse/SLING-2881
[1] http://sling.apache.org/site/internationalization-support.html
Re: [i18n] jcr resource bundle provider
On 17.12.2013, at 14:05, Tobias Bocanegra tri...@apache.org wrote:

> I was looking at SLING-2881 [0] and reading the docs at [1]. the i18n code has 2 queries. one is:
>
> 1) //element(*,mix:language)

Unfortunately this is too broad: mix:language can be on many nodes, depending on the application. That's why we are effectively using only the query below, which makes it more specific again by looking at mix:language nodes that have a sling:Message node.

The mistake we probably made was to not introduce a dictionary base node type. The problem manifests in the observation, which gets triggered on mix:language and is too broad (SLING-2881). I already made a proposal, though, for how we can solve SLING-2881 without having to change the content structure (see issue).

> 2) //element(*,mix:language)[@jcr:language='en']//element(*,sling:Message)[@sling:message]/(@sling:key|@sling:message)
>
> the first one is to find all language bundle roots, and the second is used to actually load all messages for one bundle.
>
> I suggest replacing the second one with a manual traversal, since all messages are usually just directly below the language nodes.

Yes, it was just a bit simpler to implement using a query, since it also needs to support deep structures (we used them earlier, so sling i18n supports it). But even for that, a traversal using the javax.jcr.util.TraversingItemVisitor should be trivial. However, we might be iterating way too much if we start on all mix:language nodes.

So I am not sure if we have to solve the performance of this query. If we fix the observation, this query will run only when a dictionary is updated or when the system starts. Updating dictionaries happens only very rarely in applications.

Cheers,
Alex
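For illustration, the manual traversal discussed above could look roughly like this. The sketch walks a hypothetical in-memory `Node` class, not the javax.jcr API, and falling back to the node name when sling:key is missing is an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory stand-in for a JCR node (not javax.jcr)."""
    name: str
    types: set = field(default_factory=set)      # primary type and mixins
    props: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def collect_messages(language_root):
    """Walk everything below one language root and gather key -> message."""
    messages = {}
    stack = list(language_root.children)
    while stack:
        node = stack.pop()
        if "sling:Message" in node.types and "sling:message" in node.props:
            # Assumption: use the node name when no sling:key is set.
            key = node.props.get("sling:key", node.name)
            messages[key] = node.props["sling:message"]
        # Keep descending so that deep structures are supported too.
        stack.extend(node.children)
    return messages
```

The walk visits the whole subtree, which is exactly the cost concern raised above when the starting point is every mix:language node rather than a known dictionary root.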
Re: [i18n] jcr resource bundle provider
+1

I like the dictionary approach better, as it reduces the number of complex queries. for backward compatibility, the resource bundle provider should initially do a query for the new dictionaries and, if any are found, go into 'dictionary' mode. if not, it would fall back to the old mix:language approach. forced loading of old mix:language bundles could also be enabled with an osgi config.

regards, toby
Re: [i18n] jcr resource bundle provider
On 17.12.2013, at 17:03, Tobias Bocanegra tri...@apache.org wrote:

> +1
>
> I like the dictionary approach better, as it reduces the number of complex queries. for backward compatibility, the resource bundle provider should initially do a query for the new dictionaries and, if any are found, go into 'dictionary' mode. if not, it would fall back to the old mix:language approach. forced loading of old mix:language bundles could also be enabled with an osgi config.

We discussed this f2f, and it seems ok to force users to migrate their content when upgrading, simply because the content change would be so simple: it is only the addition of a new sling:dictionary mixin for every dictionary. This could be done fairly easily in a content migration script for applications.

The benefits are not having to care about the future of this flag, and a simpler implementation.

WDYT?

Cheers,
Alex
Re: [i18n] jcr resource bundle provider
What about if we add the migration code to the bundle?

--
Carsten Ziegeler cziege...@apache.org
Re: [i18n] jcr resource bundle provider
On 17.12.2013, at 22:03, Carsten Ziegeler cziege...@apache.org wrote:

> What about if we add the migration code to the bundle?

Hmm, interesting :) I am not sure, though, if we should modify content from such a bundle. And how do we know that we already did the migration, so that we don't run the migration code over and over again on startup (which would repeat the same slow query and thus not really gain performance)?

Cheers,
Alex
Re: [i18n] jcr resource bundle provider
The bundle can either set a marker in the repository or a file in the bundle private data; the repository is the better place, as this can be used in a clustered installation to avoid duplicate or concurrent migration.

Carsten

--
Carsten Ziegeler cziege...@apache.org
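A minimal sketch of the marker idea, with a plain dictionary standing in for the repository. The marker path and the `migrate_once` function are hypothetical, and the sketch does not address locking between cluster nodes.

```python
MARKER = "/var/i18n/migrated"  # hypothetical marker path, not from the thread


def migrate_once(repo, migration):
    """Run `migration(repo)` unless the marker is already set in `repo`.

    repo: dict standing in for the shared repository.
    Returns True if the migration ran, False if it was skipped.
    """
    if repo.get(MARKER):
        return False
    migration(repo)
    repo[MARKER] = True
    return True
```

Because the marker lives in the shared store rather than in bundle-private data, a second cluster node that starts later would see it and skip the migration.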