Re: impersonation and application path
+1 for the proposal. Can we make the new behaviour of writing to users' own directory the default? Most probably users will upgrade the gateway along with apex-core. If not, they always have the option to set the flag and fall back to the legacy behaviour.

-Priyanka

On Fri, May 19, 2017 at 7:52 AM, Chinmay Kolhatkar wrote:
> +1 for pramod's proposal.
> [snip]
[jira] [Commented] (APEXCORE-724) Support for Kubernetes
[ https://issues.apache.org/jira/browse/APEXCORE-724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016862#comment-16016862 ] Deepak Narkhede commented on APEXCORE-724: -- Hi Thomas, I would like to contribute to this feature. I have also done some investigation earlier using the Kubernetes client API and pods (single or multi-container Docker) on Kubernetes. Thanks, Deepak > Support for Kubernetes > -- > > Key: APEXCORE-724 > URL: https://issues.apache.org/jira/browse/APEXCORE-724 > Project: Apache Apex Core > Issue Type: New Feature >Reporter: Thomas Weise > Labels: roadmap > > It should be possible to run Apex applications on Kubernetes. This will also > require that Apex applications can be packaged as containers (Docker or other > supported container). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (APEXCORE-724) Support for Kubernetes
Thomas Weise created APEXCORE-724: - Summary: Support for Kubernetes Key: APEXCORE-724 URL: https://issues.apache.org/jira/browse/APEXCORE-724 Project: Apache Apex Core Issue Type: New Feature Reporter: Thomas Weise It should be possible to run Apex applications on Kubernetes. This will also require that Apex applications can be packaged as containers (Docker or other supported container). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (APEXMALHAR-2495) Apex SQL: Add support for windowing
[ https://issues.apache.org/jira/browse/APEXMALHAR-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Weise updated APEXMALHAR-2495: - Labels: roadmap (was: ) > Apex SQL: Add support for windowing > --- > > Key: APEXMALHAR-2495 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2495 > Project: Apache Apex Malhar > Issue Type: New Feature >Reporter: Thomas Weise > Labels: roadmap > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (APEXMALHAR-2296) Apex SQL: Add support for SQL GROUP BY (Aggregate RelNode)
[ https://issues.apache.org/jira/browse/APEXMALHAR-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Weise updated APEXMALHAR-2296: - Labels: (was: roadmap) > Apex SQL: Add support for SQL GROUP BY (Aggregate RelNode) > -- > > Key: APEXMALHAR-2296 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2296 > Project: Apache Apex Malhar > Issue Type: New Feature > Components: sql >Reporter: Chinmay Kolhatkar > > Add support for SQL GROUP BY (Aggregate RelNode) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (APEXMALHAR-2296) Apex SQL: Add support for SQL GROUP BY (Aggregate RelNode)
[ https://issues.apache.org/jira/browse/APEXMALHAR-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Weise updated APEXMALHAR-2296: - Labels: roadmap (was: ) > Apex SQL: Add support for SQL GROUP BY (Aggregate RelNode) > -- > > Key: APEXMALHAR-2296 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2296 > Project: Apache Apex Malhar > Issue Type: New Feature > Components: sql >Reporter: Chinmay Kolhatkar > Labels: roadmap > > Add support for SQL GROUP BY (Aggregate RelNode) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (APEXMALHAR-2495) Apex SQL: Add support for windowing
Thomas Weise created APEXMALHAR-2495: Summary: Apex SQL: Add support for windowing Key: APEXMALHAR-2495 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2495 Project: Apache Apex Malhar Issue Type: New Feature Reporter: Thomas Weise -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Closed] (APEXCORE-721) Announcement section on the website is not uptodate
[ https://issues.apache.org/jira/browse/APEXCORE-721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Weise closed APEXCORE-721. - Resolution: Done > Announcement section on the website is not uptodate > --- > > Key: APEXCORE-721 > URL: https://issues.apache.org/jira/browse/APEXCORE-721 > Project: Apache Apex Core > Issue Type: Bug > Components: Website >Reporter: Pramod Immaneni >Assignee: Pramod Immaneni > > Announcement section on the main page on the website still lists malhar 3.6.0 > and core 3.5.0 as the latest releases. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[GitHub] apex-site pull request #75: APEXCORE-721 Updated announcements to malhar 3.7...
Github user asfgit closed the pull request at: https://github.com/apache/apex-site/pull/75 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (APEXCORE-721) Announcement section on the website is not uptodate
[ https://issues.apache.org/jira/browse/APEXCORE-721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016809#comment-16016809 ] ASF GitHub Bot commented on APEXCORE-721: - Github user asfgit closed the pull request at: https://github.com/apache/apex-site/pull/75 > Announcement section on the website is not uptodate > --- > > Key: APEXCORE-721 > URL: https://issues.apache.org/jira/browse/APEXCORE-721 > Project: Apache Apex Core > Issue Type: Bug > Components: Website >Reporter: Pramod Immaneni >Assignee: Pramod Immaneni > > Announcement section on the main page on the website still lists malhar 3.6.0 > and core 3.5.0 as the latest releases. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
Re: impersonation and application path
+1 for pramod's proposal.

On 19-May-2017 4:51 AM, "Sanjay Pujare" wrote:
> +1 for Pramod's proposal for impersonation.
> [snip]
Re: impersonation and application path
+1 for Pramod's proposal for impersonation.

I have an issue with Sandesh's suggestion about making the new behavior the default (or only) behavior. This will introduce an incompatibility with other legacy tools (e.g. DataTorrent's dtGateway) that assume user A's HDFS path as the application path. Because the legacy tools will continue to assume the old path (user A's path), they will not work with an Apex core that has this change.

The current behavior might also be preferable to certain users or their administrators because they do not have to deal with multiple HDFS user directories (for administration, logging, backup etc.).

On Thu, May 18, 2017 at 4:01 PM, Sandesh Hegde wrote:
> My vote is to make the new proposal as the default behavior. Is there a use
> case for the current behavior? If not then no need to add the configuration
> setting.
> [snip]
Re: impersonation and application path
My vote is to make the new proposal the default behavior. Is there a use case for the current behavior? If not, then there is no need to add the configuration setting.

On Thu, May 18, 2017 at 3:47 PM Pramod Immaneni wrote:
> Sorry typo in sentence "as we are not asking for permissions for a lower
> privilege", please read as "as we are now asking for permissions for a
> lower privilege".
> [snip]
Re: impersonation and application path
Sorry, typo in the sentence "as we are not asking for permissions for a lower privilege"; please read it as "as we are now asking for permissions for a lower privilege".

On Thu, May 18, 2017 at 3:44 PM, Pramod Immaneni wrote:
> Apex cli supports impersonation in secure mode. With impersonation, the
> user running the cli or the user authenticating with hadoop (henceforth
> referred to as login user) can be different from the effective user with
> which the actions are performed under hadoop.
> [snip]
impersonation and application path
Apex CLI supports impersonation in secure mode. With impersonation, the user running the CLI or the user authenticating with Hadoop (henceforth referred to as the login user) can be different from the effective user with which the actions are performed under Hadoop. An example of this is an application launched by user A to run in Hadoop as user B. This is similar to the sudo functionality in Unix. You can find more details about the functionality in the Impersonation section here: https://apex.apache.org/docs/apex/security/

What happens today when launching an application with impersonation, using the above example, is that even though the application runs as user B, it still uses user A's HDFS path for the application path. The application path is where the artifacts necessary to run the application are stored and where runtime files like checkpoints are stored. This means that user B needs to have read and write access to user A's application path folders.

This may not be allowed in certain environments as it may be a policy violation, for the following reason. Because user A is able to impersonate user B to launch the application, A is considered a higher privileged user than B and is given the necessary privileges in Hadoop to do so. But after launch, B needs to access folders belonging to A, which could constitute a violation as we are now asking for permissions for a lower privilege user to access resources of a higher privilege user.

I would like to propose adding a configuration setting which, when set, will use the application path in the impersonated user's home directory (user B) as opposed to the impersonating user's home directory (user A). If this setting is not specified then the behavior can default to what it is today, for backwards compatibility.

Comments, suggestions, concerns?

Thanks
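The proposed path selection can be sketched as below. Note this is only an illustration of the proposal: the configuration key `apex.application.path.use.impersonated.user`, the class name, and the `/user/<name>/datatorrent` layout are hypothetical placeholders, not the actual Apex configuration or code.

```java
import java.util.Map;

// Hypothetical sketch of the proposed behavior: when the (assumed) flag is
// set, the application path is rooted in the impersonated user's home
// directory (user B); otherwise it falls back to the login user's (user A)
// directory, preserving today's behavior for backwards compatibility.
public class AppPathResolver {

    // Hypothetical configuration key; the real name would be decided in the patch.
    static final String USE_IMPERSONATED = "apex.application.path.use.impersonated.user";

    public static String resolve(Map<String, String> conf, String loginUser, String effectiveUser) {
        boolean useImpersonated = Boolean.parseBoolean(conf.getOrDefault(USE_IMPERSONATED, "false"));
        String owner = useImpersonated ? effectiveUser : loginUser;
        // Illustrative HDFS home-directory layout only.
        return "/user/" + owner + "/datatorrent";
    }
}
```

With the flag unset, `resolve(conf, "userA", "userB")` yields user A's path (today's behavior); with it set to `true`, it yields user B's path, so B never needs access to A's folders.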
[GitHub] apex-malhar pull request #604: Apexmalhar 2467 Move JMS related examples fro...
Github user prasannapramod closed the pull request at: https://github.com/apache/apex-malhar/pull/604
[GitHub] apex-malhar pull request #604: Apexmalhar 2467 Move JMS related examples fro...
GitHub user prasannapramod reopened a pull request:

https://github.com/apache/apex-malhar/pull/604

Apexmalhar 2467 Move JMS related examples from datatorrent examples to apex-malhar examples

@amberarrow @tweise @ashwinchandrap please review

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/prasannapramod/apex-malhar APEXMALHAR-2467

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/apex-malhar/pull/604.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #604

commit 5ed84d5284777a3a842003bb5e840ac39eed7d94
Author: Sanjay Pujare
Date: 2017-04-04T22:42:26Z
New JMS ActiveMQ example to remove duplicate jdbcIngest line.

commit a00f26be4aa74df455bde754c343b0610f0a200c
Author: Sanjay Pujare
Date: 2017-04-04T22:45:39Z
SPOI-8863 New example for using jmsInput operator for reading from SQS; use elasticmq jar for unit testing

commit a77a67b915f2e3e4df627c4b7232e22aecb4bf61
Author: Lakshmi Prasanna Velineni
Date: 2017-04-04T23:06:18Z
Changes completed.

commit 4684ccf65ec0d9dec50f922fc000f9ebebdffa3c
Author: Apex Dev
Date: 2017-04-13T21:35:36Z
License Headers and checkstyle.

commit 4c148f45123b5ef96e786f8dcb02f0b7ba966347
Author: Apex Dev
Date: 2017-04-13T21:35:36Z
License Headers and checkstyle.
[jira] [Resolved] (APEXMALHAR-2475) CacheStore needn't expire data if it read-only data
[ https://issues.apache.org/jira/browse/APEXMALHAR-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pramod Immaneni resolved APEXMALHAR-2475. - Resolution: Fixed Fix Version/s: 3.8.0 > CacheStore needn't expire data if it read-only data > --- > > Key: APEXMALHAR-2475 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2475 > Project: Apache Apex Malhar > Issue Type: Sub-task >Reporter: Pramod Immaneni >Assignee: Oliver Winke > Fix For: 3.8.0 > > > The db CacheStore implementation supports expiry of data on read or write > after a configurable expiry period. The default is one minute. If the data is > read-only there is no need to expire this data. The max cache size property > will anyway ensure that the cache size doesn't grow indefinitely. The > CacheManager can provide the meta-information whether the data is read-only. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
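The reasoning in APEXMALHAR-2475 (for read-only data, a maximum cache size alone is enough to keep the cache bounded, with no time-based expiry) can be sketched with stdlib types. This is a minimal illustration of the idea using an LRU-evicting map, not the actual CacheStore implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: a size-bounded cache with LRU eviction and no time-based expiry.
// For read-only data, nothing ever becomes stale, so the size cap alone
// prevents unbounded growth, mirroring the argument in the ticket.
public class BoundedReadOnlyCache<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    public BoundedReadOnlyCache(int maxEntries) {
        super(16, 0.75f, true); // access-order iteration gives LRU behavior
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least-recently-used entry once the cap is exceeded.
        return size() > maxEntries;
    }
}
```

A cache built this way never discards entries by age, only by pressure, which is exactly the behavior the ticket argues is sufficient when the backing data cannot change.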
[jira] [Resolved] (APEXMALHAR-2474) FSLoader only returns value at the beginning
[ https://issues.apache.org/jira/browse/APEXMALHAR-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pramod Immaneni resolved APEXMALHAR-2474. - Resolution: Fixed Fix Version/s: 3.8.0 > FSLoader only returns value at the beginning > > > Key: APEXMALHAR-2474 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2474 > Project: Apache Apex Malhar > Issue Type: Sub-task >Reporter: Pramod Immaneni >Assignee: Oliver Winke > Fix For: 3.8.0 > > > FSLoader implements Backup store for db CacheManager. In the initial load, it > reads all the lines of the file, line by line, and returns a Map of key-value > pairs with a key-value pair for every line. It returns data only on the > initial load and thereafter it returns null for any key lookup. Also, there > is no need to load all the data in the file and return it if the primary > cache cannot hold all the entries. These issues need to be addressed and it > also helps if the CacheManager supplies meta-information such as how much > information should be loaded and returned in the initial load. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (APEXMALHAR-2473) Support for global cache meta information in db CacheManager
[ https://issues.apache.org/jira/browse/APEXMALHAR-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pramod Immaneni resolved APEXMALHAR-2473. - Resolution: Fixed Fix Version/s: 3.8.0 > Support for global cache meta information in db CacheManager > > > Key: APEXMALHAR-2473 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2473 > Project: Apache Apex Malhar > Issue Type: Improvement >Reporter: Pramod Immaneni >Assignee: Oliver Winke > Fix For: 3.8.0 > > > Currently db CacheManager has no knowledge of characteristics of the data or > the cache stores, so it handles all scenarios uniformly. This may not be the > optimal implementation in all cases. Better optimizations can be performed in > the manager if this information is known. A few examples, if the data is > read-only the keys in the primary cache need not be refreshed like they are > being done daily today, if the primary cache size is known the number of > initial entries loaded from backup needn't exceed it. Add support for such > general cache meta information in the manager. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (APEXMALHAR-2473) Support for global cache meta information in db CacheManager
[ https://issues.apache.org/jira/browse/APEXMALHAR-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016435#comment-16016435 ] ASF GitHub Bot commented on APEXMALHAR-2473: Github user asfgit closed the pull request at: https://github.com/apache/apex-malhar/pull/605 > Support for global cache meta information in db CacheManager > > > Key: APEXMALHAR-2473 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2473 > Project: Apache Apex Malhar > Issue Type: Improvement >Reporter: Pramod Immaneni >Assignee: Oliver Winke > > Currently db CacheManager has no knowledge of characteristics of the data or > the cache stores, so it handles all scenarios uniformly. This may not be the > optimal implementation in all cases. Better optimizations can be performed in > the manager if this information is known. A few examples, if the data is > read-only the keys in the primary cache need not be refreshed like they are > being done daily today, if the primary cache size is known the number of > initial entries loaded from backup needn't exceed it. Add support for such > general cache meta information in the manager. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[GitHub] apex-malhar pull request #605: APEXMALHAR-2473 Support for global cache meta...
Github user asfgit closed the pull request at: https://github.com/apache/apex-malhar/pull/605
[jira] [Created] (APEXMALHAR-2494) Update demo apps with description
Chaitanya created APEXMALHAR-2494: - Summary: Update demo apps with description Key: APEXMALHAR-2494 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2494 Project: Apache Apex Malhar Issue Type: Improvement Reporter: Chaitanya Assignee: Chaitanya Priority: Minor Add Readme for the demo apps which are under examples package. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (APEXMALHAR-2493) KafkaSinglePortExactlyOnceOutputOperator going to the blocked state during recovery
[ https://issues.apache.org/jira/browse/APEXMALHAR-2493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015556#comment-16015556 ] ASF GitHub Bot commented on APEXMALHAR-2493: GitHub user chaithu14 opened a pull request: https://github.com/apache/apex-malhar/pull/622 APEXMALHAR-2493 Fixed the issue of KafkaSinglePortExactlyOnceOutputOperator going to the blocked state during recovery @sandeshh @tushargosavi Please review and merge. You can merge this pull request into a Git repository by running: $ git pull https://github.com/chaithu14/incubator-apex-malhar APEXMALHAR-2493-KafkaExactlyCBBug Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-malhar/pull/622.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #622 commit c784f4da46d1cf594aa4156135b9c196aa66d931 Author: chaitanyaDate: 2017-05-18T10:37:52Z APEXMALHAR-2493 Fixed the issue of KafkaSinglePortExactlyOnceOutputOperator going to the blocked state during recovery > KafkaSinglePortExactlyOnceOutputOperator going to the blocked state during > recovery > --- > > Key: APEXMALHAR-2493 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2493 > Project: Apache Apex Malhar > Issue Type: Bug >Reporter: Chaitanya >Assignee: Chaitanya > > Steps to reproduce the issue: > --- > - Created the Kafka topic with single partition. > - Created the application with the following DAG: > BatchSequenceGenerator -> KafkaSinglePortExactlyOnceOutputOperator > # of partitions of KafkaSinglePortExactlyOnceOutputOperator = 2. > Let's say KO1, KO2 are the two instances. > - Launched the app, after some time, manually killed the one of the instance > of "KafkaSinglePortExactlyOnceOutputOperator" operator(KO2). > - During recovery, the instance comes up and after some time, it goes to the > blocked state. App master killed this instance. 
> Observation:
>
> * There is an infinite while loop in the rebuildPartialWindow() method.
> * The while loop breaks only on one of the two conditions below:
>   a) # of tries for "polled records from Kafka is empty" reaches 10
>   b) Crossed the boundary (consumerRecord.offset() >= currentOffset)
> In this scenario, KO1 keeps writing data to Kafka, so the first condition is never satisfied.
> The operator never reaches the 2nd condition because of the continue statement below:
> if (!doesKeyBelongsToThisInstance(operatorId, consumerRecord.key())) {
>   continue;
> }
> Solution: First check the cross-boundary condition and then check doesKeyBelongsToThisInstance(..).
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
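To make the loop-exit problem concrete, here is a minimal, self-contained Java sketch of the recovery-scan logic described above. `Record` and `ownsKey` are illustrative stand-ins for Kafka's `ConsumerRecord` and the operator's `doesKeyBelongsToThisInstance(..)`; this is not the actual Malhar operator code, only a toy model of the two check orderings.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the rebuildPartialWindow() exit logic.
public class RebuildWindowSketch {
  static class Record {
    final long offset;
    final String key;
    Record(long offset, String key) { this.offset = offset; this.key = key; }
  }

  // Pretend this instance owns only records with key "A"; records with other
  // keys were written by the peer partition (KO1).
  static boolean ownsKey(String key) {
    return "A".equals(key);
  }

  // Buggy ordering: foreign keys are skipped before the boundary test, so the
  // break is never reached once the remaining records all belong to the peer.
  // Returns how many records were examined before the loop exited.
  static int scanBuggy(List<Record> records, long currentOffset) {
    int scanned = 0;
    for (Record r : records) {
      scanned++;
      if (!ownsKey(r.key)) {
        continue;            // boundary check below never runs for this record
      }
      if (r.offset >= currentOffset) {
        break;               // unreachable while KO1 keeps producing
      }
      // ... replay r ...
    }
    return scanned;          // in the real operator the poll loop spins forever
  }

  // Fixed ordering: test the offset boundary first, then key ownership.
  static int scanFixed(List<Record> records, long currentOffset) {
    int scanned = 0;
    for (Record r : records) {
      scanned++;
      if (r.offset >= currentOffset) {
        break;               // stop regardless of who wrote the record
      }
      if (!ownsKey(r.key)) {
        continue;
      }
      // ... replay r ...
    }
    return scanned;
  }
}
```

With offsets 0..9 where every record from offset 5 onward carries the peer's key, and currentOffset = 5, the fixed ordering stops after examining 6 records, while the buggy ordering drains the entire (here finite) stream without ever hitting the break.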
[GitHub] apex-malhar pull request #622: APEXMALHAR-2493 Fixed the issue of KafkaSingl...
GitHub user chaithu14 opened a pull request: https://github.com/apache/apex-malhar/pull/622
APEXMALHAR-2493 Fixed the issue of KafkaSinglePortExactlyOnceOutputOperator going to the blocked state during recovery
@sandeshh @tushargosavi Please review and merge.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/chaithu14/incubator-apex-malhar APEXMALHAR-2493-KafkaExactlyCBBug
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-malhar/pull/622.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #622
commit c784f4da46d1cf594aa4156135b9c196aa66d931
Author: chaitanya
Date: 2017-05-18T10:37:52Z
APEXMALHAR-2493 Fixed the issue of KafkaSinglePortExactlyOnceOutputOperator going to the blocked state during recovery
[jira] [Created] (APEXMALHAR-2493) KafkaSinglePortExactlyOnceOutputOperator going to the blocked state during recovery
Chaitanya created APEXMALHAR-2493:
-
Summary: KafkaSinglePortExactlyOnceOutputOperator going to the blocked state during recovery
Key: APEXMALHAR-2493
URL: https://issues.apache.org/jira/browse/APEXMALHAR-2493
Project: Apache Apex Malhar
Issue Type: Bug
Reporter: Chaitanya
Assignee: Chaitanya

Steps to reproduce the issue:
---
- Created a Kafka topic with a single partition.
- Created the application with the following DAG: BatchSequenceGenerator -> KafkaSinglePortExactlyOnceOutputOperator
- # of partitions of KafkaSinglePortExactlyOnceOutputOperator = 2. Let's say KO1 and KO2 are the two instances.
- Launched the app and, after some time, manually killed one of the instances of the "KafkaSinglePortExactlyOnceOutputOperator" operator (KO2).
- During recovery, the instance comes up and, after some time, goes to the blocked state. The App Master then kills this instance.

Observation:
* There is an infinite while loop in the rebuildPartialWindow() method.
* The while loop breaks only on one of the two conditions below:
  a) # of tries for "polled records from Kafka is empty" reaches 10
  b) Crossed the boundary (consumerRecord.offset() >= currentOffset)
In this scenario, KO1 keeps writing data to Kafka, so the first condition is never satisfied.
The operator never reaches the 2nd condition because of the continue statement below:
if (!doesKeyBelongsToThisInstance(operatorId, consumerRecord.key())) {
  continue;
}
Solution: First check the cross-boundary condition and then check doesKeyBelongsToThisInstance(..).