Re: [VOTE] Apache Apex Malhar Release 3.6.0 (RC1)
+1 (binding)
Verified checksums
Verified LICENSE, NOTICE and README.md.
Built with: mvn clean apache-rat:check verify -Dlicense.skip=false -Pall-modules install -DskipTests
Thank you,
Vlad

On 11/30/16 22:56, Bhupesh Chawda wrote:
+1
- Verified signatures
- Build and test successful
- LICENSE, NOTICE, README, CHANGELOG.md exist
~ Bhupesh

On Thu, Dec 1, 2016 at 11:24 AM, David Yan wrote:
+1 (binding)
Verified existence of LICENSE, NOTICE, README.md and CHANGELOG.md files
Built with this command: mvn clean apache-rat:check verify -Dlicense.skip=false -Pall-modules install
with no errors.
Verified pi demo

On Wed, Nov 30, 2016 at 11:37 AM, Siyuan Hua wrote:
+1
Verified checksums
Verified compilation
Verified build and test
Verified pi demo

On Wed, Nov 30, 2016 at 9:50 AM, Tushar Gosavi wrote:
+1
Verified checksums
Verified compilation
- Tushar.

On Wed, Nov 30, 2016 at 7:43 PM, Thomas Weise wrote:
Can folks please verify the release.
Thanks
-- sent from mobile

On Nov 26, 2016 6:32 PM, "Thomas Weise" wrote:
Dear Community,

Please vote on the following Apache Apex Malhar 3.6.0 release candidate.

This is a source release with binary artifacts published to Maven.

This release is based on Apex Core 3.4 and resolves 69 issues.

The release adds the first iteration of SQL support via Apache Calcite, an alternative Cassandra output operator (non-transactional, upsert based), an enrichment operator, improvements to window storage, and new user documentation for several operators, along with many other enhancements and bug fixes.
List of all issues fixed: https://s.apache.org/9b0t
User documentation: http://apex.apache.org/docs/malhar-3.6/

Staging directory:
https://dist.apache.org/repos/dist/dev/apex/apache-apex-malhar-3.6.0-RC1/
Source zip:
https://dist.apache.org/repos/dist/dev/apex/apache-apex-malhar-3.6.0-RC1/apache-apex-malhar-3.6.0-source-release.zip
Source tar.gz:
https://dist.apache.org/repos/dist/dev/apex/apache-apex-malhar-3.6.0-RC1/apache-apex-malhar-3.6.0-source-release.tar.gz
Maven staging repository:
https://repository.apache.org/content/repositories/orgapacheapex-1020/

Git source:
https://git-wip-us.apache.org/repos/asf?p=apex-malhar.git;a=commit;h=refs/tags/v3.6.0-RC1
(commit: 43d524dc5d5326b8d94593901cad026528bb62a1)

PGP key:
http://pgp.mit.edu:11371/pks/lookup?op=vindex=t...@apache.org
KEYS file:
https://dist.apache.org/repos/dist/release/apex/KEYS

More information at:
http://apex.apache.org

Please try the release and vote; the vote will be open until Wed, 11/30 EOD PST considering the US holiday weekend.

[ ] +1 approve (and what verification was done)
[ ] -1 disapprove (and reason why)

http://www.apache.org/foundation/voting.html

How to verify release candidate:
http://apex.apache.org/verification.html

Thanks,
Thomas
Re: "ExcludeNodes" for an Apex application
Yes, Ram explained to me that in practice this would be a useful feature for Apex devops, who typically have no control over the Hadoop/Yarn cluster.

On 11/30/16, 9:22 PM, "Mohit Jotwani" wrote:
This is a practical scenario where developers would be required to exclude certain nodes, as those nodes might be required for some mission-critical applications. It would be good to have this feature. I understand that Stram should not get into resourcing and should still rely on Yarn; however, as the App Master it should have the right to reject the nodes offered by Yarn and request other resources.
Regards,
Mohit

On Thu, Dec 1, 2016 at 2:34 AM, Sandesh Hegde wrote:
Apex has automatic blacklisting of troublesome nodes; please take a look at the following attributes:

MAX_CONSECUTIVE_CONTAINER_FAILURES_FOR_BLACKLIST
https://www.datatorrent.com/docs/apidocs/com/datatorrent/api/Context.DAGContext.html#MAX_CONSECUTIVE_CONTAINER_FAILURES_FOR_BLACKLIST

BLACKLISTED_NODE_REMOVAL_TIME_MILLIS

Thanks

On Wed, Nov 30, 2016 at 12:56 PM Munagala Ramanath wrote:
Not sure if this is what Milind had in mind, but we often run into situations where the dev group working with Apex has no control over cluster configuration -- to make any changes to the cluster they need to go through an elaborate process that can take many days.

Meanwhile, if they notice that a particular node is consistently causing problems for their app, having a simple way to exclude it would be very helpful, since it gives them a way to bypass communication and process issues within their own organization.

Ram

On Wed, Nov 30, 2016 at 10:58 AM, Sanjay Pujare wrote:
To me, both use cases appear to be generic resource management use cases. For example, a randomly rebooting node is not good for any purpose, especially long-running apps, so it is a bit of a stretch to imagine that these nodes will be acceptable for some batch jobs in Yarn. So such a node should be marked "Bad" or Unavailable in Yarn itself.

The second use case is also a typical anti-affinity use case, which ideally should be implemented in Yarn -- Milind's example can also apply to non-Apex batch jobs. In any case, it looks like Yarn still doesn't have it (https://issues.apache.org/jira/browse/YARN-1042), so if Apex needs it we will need to do it ourselves.

On 11/30/16, 10:39 AM, "Munagala Ramanath" wrote:
But then, what's the solution to the 2 problem scenarios that Milind describes?

Ram

On Wed, Nov 30, 2016 at 10:34 AM, Sanjay Pujare <san...@datatorrent.com> wrote:
I think "exclude nodes" and such is really the job of the resource manager, i.e. Yarn. So I am not sure taking over some of these tasks in Apex would be very useful.

I agree with Amol that apps should be node neutral. Resource management in Yarn together with fault tolerance in Apex should minimize the need for this feature, although I am sure one can find use cases.

On 11/29/16, 10:41 PM, "Amol Kekre" wrote:
We do have this feature in Yarn, but that applies to all applications. I am not sure if Yarn has anti-affinity. This feature may be used, but in general there is danger in an application taking over resource allocation. Another quirk is that big data apps should ideally be node-neutral. This is a good idea if we are able to carve out something where the need is app-specific.

Thks
Amol

On Tue, Nov 29, 2016 at 10:00 PM, Milind Barve <mili...@gmail.com> wrote:
We have seen the 2 cases mentioned below, where it would have been nice if Apex allowed us to exclude a node from the cluster for an application.

1. A node in the cluster had gone bad (was randomly rebooting) and so an Apex app should not use it - other apps can use it as they were batch jobs.
2. A node is
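The blacklisting attributes Sandesh points to above are DAG-level attributes, so they would typically be set per application. The following is an illustrative sketch only: the `dt.attr` property prefix and the values shown are assumptions about the usual Apex properties-file convention, not something stated in this thread.

```xml
<!-- Illustrative sketch: attribute names are from the thread; the
     dt.attr prefix and the values are assumptions, not verified
     against a specific Apex version. -->
<configuration>
  <property>
    <!-- Blacklist a node after 3 consecutive container failures -->
    <name>dt.attr.MAX_CONSECUTIVE_CONTAINER_FAILURES_FOR_BLACKLIST</name>
    <value>3</value>
  </property>
  <property>
    <!-- Allow a blacklisted node back after one hour -->
    <name>dt.attr.BLACKLISTED_NODE_REMOVAL_TIME_MILLIS</name>
    <value>3600000</value>
  </property>
</configuration>
```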
[jira] [Resolved] (APEXMALHAR-2022) S3 Output Module for file copy
[ https://issues.apache.org/jira/browse/APEXMALHAR-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhupesh Chawda resolved APEXMALHAR-2022.
----------------------------------------
    Resolution: Fixed
    Fix Version/s: 3.7.0

> S3 Output Module for file copy
> ------------------------------
>
>                 Key: APEXMALHAR-2022
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2022
>             Project: Apache Apex Malhar
>          Issue Type: Task
>            Reporter: Chaitanya
>            Assignee: Chaitanya
>             Fix For: 3.7.0
>
> The primary functionality of this module is to copy files into an S3 bucket using a block-by-block approach.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXMALHAR-2022) S3 Output Module for file copy
[ https://issues.apache.org/jira/browse/APEXMALHAR-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710953#comment-15710953 ] ASF GitHub Bot commented on APEXMALHAR-2022: Github user asfgit closed the pull request at: https://github.com/apache/apex-malhar/pull/483 > S3 Output Module for file copy > -- > > Key: APEXMALHAR-2022 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2022 > Project: Apache Apex Malhar > Issue Type: Task >Reporter: Chaitanya >Assignee: Chaitanya > > Primary functionality of this module is copy files into S3 bucket using > block-by-block approach. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] apex-malhar pull request #483: APEXMALHAR-2022 Developed S3 Output Module
Github user asfgit closed the pull request at: https://github.com/apache/apex-malhar/pull/483 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
Re: [DISCUSSION] Custom Control Tuples
I would like to work on https://issues.apache.org/jira/browse/APEXCORE-580. ~ Bhupesh On Thu, Dec 1, 2016 at 5:42 AM, Sandesh Hegdewrote: > I am interested in working on the following subtask > > https://issues.apache.org/jira/browse/APEXCORE-581 > > Thanks > > > On Wed, Nov 30, 2016 at 2:07 PM David Yan wrote: > > > I have created an umbrella ticket for control tuple support: > > > > https://issues.apache.org/jira/browse/APEXCORE-579 > > > > Currently it has two subtasks. Please have a look at them and see whether > > I'm missing anything or if you have anything to add. You are welcome to > add > > more subtasks or comment on the existing subtasks. > > > > We would like to kick start the implementation soon. > > > > Thanks! > > > > David > > > > On Mon, Nov 28, 2016 at 5:22 PM, Bhupesh Chawda > > > wrote: > > > > > +1 for the plan. > > > > > > I would be interested in contributing to this feature. > > > > > > ~ Bhupesh > > > > > > On Nov 29, 2016 03:26, "Sandesh Hegde" > wrote: > > > > > > > I am interested in contributing to this feature. > > > > > > > > On Mon, Nov 28, 2016 at 1:54 PM David Yan > > wrote: > > > > > > > > > I think we should probably go ahead with option 1 since this works > > with > > > > > most use cases and prevents developers from shooting themselves in > > the > > > > foot > > > > > in terms of idempotency. > > > > > > > > > > We can have a configuration property that enables option 2 later if > > we > > > > have > > > > > concrete use cases that call for it. > > > > > > > > > > Please share your thoughts if you think you don't agree with this > > plan. > > > > > Also, please indicate if you're interested in contributing to this > > > > feature. > > > > > > > > > > David > > > > > > > > > > On Sun, Nov 27, 2016 at 9:02 PM, Bhupesh Chawda < > > > bhup...@datatorrent.com > > > > > > > > > > wrote: > > > > > > > > > > > It appears that option 1 is more favored due to unavailability > of a > > > use > > > > > > case which could use option 2. 
> > > > > > > > > > > > However, option 2 is problematic in specific cases, like presence > > of > > > > > > multiple input ports for example. In case of a linear DAG where > > > control > > > > > > tuples are flowing in order with the data tuples, it should not > be > > > > > > difficult to guarantee idempotency. For example, cases where > there > > > > could > > > > > be > > > > > > multiple changes in behavior of an operator during a single > window, > > > it > > > > > > should not wait until end window for these changes to take > effect. > > > > Since, > > > > > > we don't have a concrete use case right now, perhaps we do not > want > > > to > > > > go > > > > > > that road. This feature should be available through a platform > > > > attribute > > > > > > (may be at a later point in time) where the default is option 1. > > > > > > > > > > > > I think option 1 is suitable for a starting point in the > > > implementation > > > > > of > > > > > > this feature and we should proceed with it. > > > > > > > > > > > > ~ Bhupesh > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Nov 11, 2016 at 12:59 AM, David Yan < > da...@datatorrent.com > > > > > > > > wrote: > > > > > > > > > > > > > Good question Tushar. The callback should be called only once. > > > > > > > The way to implement this is to keep a list of control tuple > > hashes > > > > for > > > > > > the > > > > > > > given streaming window and only do the callback when the > operator > > > has > > > > > not > > > > > > > seen it before. > > > > > > > > > > > > > > Other thoughts? 
> > > > > > > > > > > > > > David > > > > > > > > > > > > > > On Thu, Nov 10, 2016 at 9:32 AM, Tushar Gosavi < > > > > tus...@datatorrent.com > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > Hi David, > > > > > > > > > > > > > > > > What would be the behaviour in case where we have a DAG with > > > > > following > > > > > > > > operators, the number in bracket is number of partitions, X > is > > > NxM > > > > > > > > partitioning. > > > > > > > > A(1) X B(4) X C(2) > > > > > > > > > > > > > > > > If A sends a control tuple, it will be sent to all 4 > partition > > of > > > > B, > > > > > > > > and from each partition from B it goes to C, i.e each > partition > > > of > > > > C > > > > > > > > will receive same control tuple originated from A multiple > > times > > > > > > > > (number of upstream partitions of C). In this case will the > > > > callback > > > > > > > > function get called multiple times or just once. > > > > > > > > > > > > > > > > -Tushar. > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Nov 4, 2016 at 12:14 AM, David Yan < > > > da...@datatorrent.com> > > > > > > > wrote: > > > > > > > > > Hi Bhupesh, > > > > > > > > > > > > > > > > > > Since each input port has its own incoming control tuple, I > > > would > > > > > > > imagine > > > > > > > > >
[jira] [Commented] (APEXMALHAR-2361) Optimise SpillableWindowedKeyedStorage remove(Window) to improve the performance
[ https://issues.apache.org/jira/browse/APEXMALHAR-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710326#comment-15710326 ]

David Yan commented on APEXMALHAR-2361:
---------------------------------------

I think removing it is better, since you don't want to trigger again next time with the default value.

> Optimise SpillableWindowedKeyedStorage remove(Window) to improve the performance
> --------------------------------------------------------------------------------
>
>                 Key: APEXMALHAR-2361
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2361
>             Project: Apache Apex Malhar
>          Issue Type: Improvement
>            Reporter: bright chen
>            Assignee: bright chen
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> Currently, SpillableWindowedKeyedStorage remove(Window) goes through each key and marks all of them as deleted. This would be expensive when there are lots of keys, especially when the entries have already spilled out of memory (the common case when remove() is called). Suggest marking the whole window as deleted instead. Once a window is marked as deleted, adding or updating any entry of that window is disallowed (this should match the requirement, as remove(Window) is only called after the allowed lateness).

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXMALHAR-2361) Optimise SpillableWindowedKeyedStorage remove(Window) to improve the performance
[ https://issues.apache.org/jira/browse/APEXMALHAR-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710303#comment-15710303 ]

bright chen commented on APEXMALHAR-2361:
-----------------------------------------

Probably we can handle DISCARDING by updating the value to the default value instead of clearing the window. Then remove(Window) would only be used for windows after the allowed lateness.

> Optimise SpillableWindowedKeyedStorage remove(Window) to improve the performance
> --------------------------------------------------------------------------------
>
>                 Key: APEXMALHAR-2361
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2361
>             Project: Apache Apex Malhar
>          Issue Type: Improvement
>            Reporter: bright chen
>            Assignee: bright chen
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> Currently, SpillableWindowedKeyedStorage remove(Window) goes through each key and marks all of them as deleted. This would be expensive when there are lots of keys, especially when the entries have already spilled out of memory (the common case when remove() is called). Suggest marking the whole window as deleted instead. Once a window is marked as deleted, adding or updating any entry of that window is disallowed (this should match the requirement, as remove(Window) is only called after the allowed lateness).

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (APEXCORE-581) Delivery of Custom Control Tuples
[ https://issues.apache.org/jira/browse/APEXCORE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Yan updated APEXCORE-581:
-------------------------------
    Description:
The behavior should be as follows:
- The control tuples should only be sent downstream at streaming window boundaries
- The control tuples should be sent to all partitions downstream
- The control tuples should be sent in the same order of arrival
- Within a streaming window, do not send the same control tuple twice, even if the same control tuple is received multiple times within that window. This is possible if the operator has two input ports. (The LinkedHashMap should easily be able to ensure both order and uniqueness.)
- The delivery of control tuples needs to stop at DelayOperator
- When a streaming window is committed, remove the associated LinkedHashMaps that belong to windows with IDs less than the committed window
- It's safe to assume the control tuples are rare enough to fit in memory

This will involve an additional MessageType to represent a custom control tuple. We probably need a data structure (possibly a LinkedHashMap) per streaming window that stores the control tuples in the buffer server.

  was:
The behavior should be as follows:
- The control tuples should only be sent downstream at streaming window boundaries
- The control tuples should be sent to all partitions downstream
- The control tuples should be sent in the same order of arrival
- Within a streaming window, do not send the same control tuple twice, even if the same control tuple is received multiple times within that window. This is possible if the operator has two input ports. (The LinkedHashMap should easily be able to ensure both order and uniqueness.)
- The delivery of control tuples needs to stop at DelayOperator
- When a streaming window is committed, remove the associated LinkedHashMaps that belong to windows with IDs less than the committed window
- It's safe to assume the control tuples are rare enough to fit in memory

This will involve an additional MessageType to represent a custom control tuple. We probably need a data structure (possibly a LinkedHashMap) per streaming window that stores the control tuples in the buffer server.

> Delivery of Custom Control Tuples
> ---------------------------------
>
>                 Key: APEXCORE-581
>                 URL: https://issues.apache.org/jira/browse/APEXCORE-581
>             Project: Apache Apex Core
>          Issue Type: Sub-task
>            Reporter: David Yan
>
> The behavior should be as follows:
> - The control tuples should only be sent downstream at streaming window boundaries
> - The control tuples should be sent to all partitions downstream
> - The control tuples should be sent in the same order of arrival
> - Within a streaming window, do not send the same control tuple twice, even if the same control tuple is received multiple times within that window. This is possible if the operator has two input ports. (The LinkedHashMap should easily be able to ensure both order and uniqueness.)
> - The delivery of control tuples needs to stop at DelayOperator
> - When a streaming window is committed, remove the associated LinkedHashMaps that belong to windows with IDs less than the committed window
> - It's safe to assume the control tuples are rare enough to fit in memory
>
> This will involve an additional MessageType to represent a custom control tuple.
> We probably need a data structure (possibly a LinkedHashMap) per streaming window that stores the control tuples in the buffer server.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
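The ordering, de-duplication, and commit-time cleanup requirements described in this ticket can be sketched outside of Apex. The following is a minimal Python model with hypothetical names, standing in for the per-window LinkedHashMap in the buffer server; it is not the actual Apex implementation. A Python dict preserves insertion order, mirroring LinkedHashMap's order-plus-uniqueness guarantee.

```python
# Toy model (not Apex code) of the behavior described in APEXCORE-581:
# per streaming window, keep control tuples unique and in arrival order,
# and drop state for windows older than the committed one.

class ControlTupleBuffer:
    def __init__(self):
        self.by_window = {}  # window_id -> {tuple_hash: tuple}

    def add(self, window_id, ctrl):
        # The same tuple seen twice in one window (e.g. via two input
        # ports) is stored only once; first-arrival order is kept.
        self.by_window.setdefault(window_id, {}).setdefault(hash(ctrl), ctrl)

    def tuples_for(self, window_id):
        # Delivered downstream at the window boundary, in arrival order.
        return list(self.by_window.get(window_id, {}).values())

    def committed(self, window_id):
        # Remove state for windows with IDs less than the committed window.
        self.by_window = {w: t for w, t in self.by_window.items()
                          if w >= window_id}
```

For example, adding the same control tuple twice in window 1 yields a single delivery for that window, and committing window 2 discards window 1's state.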
Re: [DISCUSSION] Custom Control Tuples
I am interested in working on the following subtask https://issues.apache.org/jira/browse/APEXCORE-581 Thanks On Wed, Nov 30, 2016 at 2:07 PM David Yanwrote: > I have created an umbrella ticket for control tuple support: > > https://issues.apache.org/jira/browse/APEXCORE-579 > > Currently it has two subtasks. Please have a look at them and see whether > I'm missing anything or if you have anything to add. You are welcome to add > more subtasks or comment on the existing subtasks. > > We would like to kick start the implementation soon. > > Thanks! > > David > > On Mon, Nov 28, 2016 at 5:22 PM, Bhupesh Chawda > wrote: > > > +1 for the plan. > > > > I would be interested in contributing to this feature. > > > > ~ Bhupesh > > > > On Nov 29, 2016 03:26, "Sandesh Hegde" wrote: > > > > > I am interested in contributing to this feature. > > > > > > On Mon, Nov 28, 2016 at 1:54 PM David Yan > wrote: > > > > > > > I think we should probably go ahead with option 1 since this works > with > > > > most use cases and prevents developers from shooting themselves in > the > > > foot > > > > in terms of idempotency. > > > > > > > > We can have a configuration property that enables option 2 later if > we > > > have > > > > concrete use cases that call for it. > > > > > > > > Please share your thoughts if you think you don't agree with this > plan. > > > > Also, please indicate if you're interested in contributing to this > > > feature. > > > > > > > > David > > > > > > > > On Sun, Nov 27, 2016 at 9:02 PM, Bhupesh Chawda < > > bhup...@datatorrent.com > > > > > > > > wrote: > > > > > > > > > It appears that option 1 is more favored due to unavailability of a > > use > > > > > case which could use option 2. > > > > > > > > > > However, option 2 is problematic in specific cases, like presence > of > > > > > multiple input ports for example. 
In case of a linear DAG where > > control > > > > > tuples are flowing in order with the data tuples, it should not be > > > > > difficult to guarantee idempotency. For example, cases where there > > > could > > > > be > > > > > multiple changes in behavior of an operator during a single window, > > it > > > > > should not wait until end window for these changes to take effect. > > > Since, > > > > > we don't have a concrete use case right now, perhaps we do not want > > to > > > go > > > > > that road. This feature should be available through a platform > > > attribute > > > > > (may be at a later point in time) where the default is option 1. > > > > > > > > > > I think option 1 is suitable for a starting point in the > > implementation > > > > of > > > > > this feature and we should proceed with it. > > > > > > > > > > ~ Bhupesh > > > > > > > > > > > > > > > > > > > > On Fri, Nov 11, 2016 at 12:59 AM, David Yan > > > > > wrote: > > > > > > > > > > > Good question Tushar. The callback should be called only once. > > > > > > The way to implement this is to keep a list of control tuple > hashes > > > for > > > > > the > > > > > > given streaming window and only do the callback when the operator > > has > > > > not > > > > > > seen it before. > > > > > > > > > > > > Other thoughts? > > > > > > > > > > > > David > > > > > > > > > > > > On Thu, Nov 10, 2016 at 9:32 AM, Tushar Gosavi < > > > tus...@datatorrent.com > > > > > > > > > > > wrote: > > > > > > > > > > > > > Hi David, > > > > > > > > > > > > > > What would be the behaviour in case where we have a DAG with > > > > following > > > > > > > operators, the number in bracket is number of partitions, X is > > NxM > > > > > > > partitioning. 
> > > > > > > A(1) X B(4) X C(2) > > > > > > > > > > > > > > If A sends a control tuple, it will be sent to all 4 partition > of > > > B, > > > > > > > and from each partition from B it goes to C, i.e each partition > > of > > > C > > > > > > > will receive same control tuple originated from A multiple > times > > > > > > > (number of upstream partitions of C). In this case will the > > > callback > > > > > > > function get called multiple times or just once. > > > > > > > > > > > > > > -Tushar. > > > > > > > > > > > > > > > > > > > > > On Fri, Nov 4, 2016 at 12:14 AM, David Yan < > > da...@datatorrent.com> > > > > > > wrote: > > > > > > > > Hi Bhupesh, > > > > > > > > > > > > > > > > Since each input port has its own incoming control tuple, I > > would > > > > > > imagine > > > > > > > > there would be an additional DefaultInputPort.processControl > > > method > > > > > > that > > > > > > > > operator developers can override. > > > > > > > > If we go for option 1, my thinking is that the control tuples > > > would > > > > > > > always > > > > > > > > be delivered at the next window boundary, even if the emit > > method > > > > is > > > > > > > called > > > > > > > > within a window. > > > > > > > > > > > > > > > > David > > > > > > > > > > > > > > > > On Thu, Nov 3, 2016 at
[GitHub] apex-malhar pull request #519: APEXMALHAR-2362 #resolve clearing the removed...
GitHub user davidyan74 opened a pull request:

    https://github.com/apache/apex-malhar/pull/519

    APEXMALHAR-2362 #resolve clearing the removedSets at endWindow in SpillableSetMultimapImpl

    @brightchen please review and merge

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/davidyan74/apex-malhar APEXMALHAR-2362

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/apex-malhar/pull/519.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #519

commit 87e4cbddaf82c72d6a6b08bbda3b33b25fb01765
Author: David Yan
Date: 2016-11-30T23:18:46Z

    APEXMALHAR-2362 #resolve clearing the removedSets at endWindow in SpillableSetMultimapImpl
[jira] [Moved] (APEXMALHAR-2362) SpillableSetMulitmapImpl.removedSets keeps growing
[ https://issues.apache.org/jira/browse/APEXMALHAR-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Yan moved APEXCORE-582 to APEXMALHAR-2362: Workflow: Default workflow, editable Closed status (was: jira) Key: APEXMALHAR-2362 (was: APEXCORE-582) Project: Apache Apex Malhar (was: Apache Apex Core) > SpillableSetMulitmapImpl.removedSets keeps growing > -- > > Key: APEXMALHAR-2362 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2362 > Project: Apache Apex Malhar > Issue Type: Bug >Reporter: David Yan >Assignee: David Yan > > That list is only added to but not removed and it will grow over time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (APEXCORE-582) SpillableSetMulitmapImpl.removedSets keeps growing
David Yan created APEXCORE-582: -- Summary: SpillableSetMulitmapImpl.removedSets keeps growing Key: APEXCORE-582 URL: https://issues.apache.org/jira/browse/APEXCORE-582 Project: Apache Apex Core Issue Type: Bug Reporter: David Yan Assignee: David Yan That list is only added to but not removed and it will grow over time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXMALHAR-2361) Optimise SpillableWindowedKeyedStorage remove(Window) to improve the performance
[ https://issues.apache.org/jira/browse/APEXMALHAR-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710094#comment-15710094 ]

David Yan commented on APEXMALHAR-2361:
---------------------------------------

It makes sense. This potentially skips a lot of lookups, because we can simply delete the windows and not worry about the keys in the windows.

> Optimise SpillableWindowedKeyedStorage remove(Window) to improve the performance
> --------------------------------------------------------------------------------
>
>                 Key: APEXMALHAR-2361
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2361
>             Project: Apache Apex Malhar
>          Issue Type: Improvement
>            Reporter: bright chen
>            Assignee: bright chen
>   Original Estimate: 120h
>  Remaining Estimate: 120h
>
> Currently, SpillableWindowedKeyedStorage remove(Window) goes through each key and marks all of them as deleted. This would be expensive when there are lots of keys, especially when the entries have already spilled out of memory (the common case when remove() is called). Suggest marking the whole window as deleted instead. Once a window is marked as deleted, adding or updating any entry of that window is disallowed (this should match the requirement, as remove(Window) is only called after the allowed lateness).

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXMALHAR-2339) Windowed Operator benchmarking
[ https://issues.apache.org/jira/browse/APEXMALHAR-2339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15710031#comment-15710031 ]

bright chen commented on APEXMALHAR-2339:
-----------------------------------------

SpillableWindowedKeyedStorage.remove(Window) goes through each entry of the window and marks it as deleted, which would be expensive. Suggest optimizing it. See https://issues.apache.org/jira/browse/APEXMALHAR-2361

> Windowed Operator benchmarking
> ------------------------------
>
>                 Key: APEXMALHAR-2339
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2339
>             Project: Apache Apex Malhar
>          Issue Type: Task
>            Reporter: bright chen
>            Assignee: bright chen
>         Attachments: Screen Shot 2016-11-21 at 10.34.38 AM.png
>

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (APEXMALHAR-2361) Optimise SpillableWindowedKeyedStorage remove(Window) to improve the performance
bright chen created APEXMALHAR-2361: --- Summary: Optimise SpillableWindowedKeyedStorage remove(Window) to improve the performance Key: APEXMALHAR-2361 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2361 Project: Apache Apex Malhar Issue Type: Improvement Reporter: bright chen Assignee: bright chen Currently, SpillableWindowedKeyedStorage remove(Window) goes through each key and marks all of them as deleted. This is expensive when there are lots of keys, especially when these entries have already spilled out of memory (this is the common case when remove() is called). Suggest marking the whole window as deleted. Once the window is marked as deleted, adding or updating any entry of that window will not be allowed (this matches the requirement, as remove(Window) is only called after the allowed lateness). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
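The proposal amounts to replacing per-key tombstones with a single window-level tombstone. The following is a simplified sketch of that idea, not the actual SpillableWindowedKeyedStorage implementation; the class and method names are illustrative only:

```java
import java.util.*;

// Simplified sketch of the proposed optimization: instead of marking every
// key in a window as deleted, record one window-level tombstone.
// Illustrative model, not the actual Malhar API.
class WindowTombstoneStorage<K, V> {
  private final Map<Long, Map<K, V>> data = new HashMap<>();  // windowId -> entries
  private final Set<Long> deletedWindows = new HashSet<>();   // window-level tombstones

  void put(long windowId, K key, V value) {
    if (deletedWindows.contains(windowId)) {
      // A removed window may not accept new entries; per the ticket,
      // remove(Window) is only called after the allowed lateness has passed.
      throw new IllegalStateException("window " + windowId + " was removed");
    }
    data.computeIfAbsent(windowId, w -> new HashMap<>()).put(key, value);
  }

  V get(long windowId, K key) {
    if (deletedWindows.contains(windowId)) {
      return null;  // whole window is deleted; no per-key lookup needed
    }
    Map<K, V> entries = data.get(windowId);
    return entries == null ? null : entries.get(key);
  }

  // O(1) regardless of the number of keys: no per-key iteration, and no
  // need to fetch entries that have already spilled out of memory.
  void remove(long windowId) {
    deletedWindows.add(windowId);
    data.remove(windowId);  // drop the in-memory portion eagerly
  }
}
```

The key point is that `remove` never touches spilled entries at all, which is exactly the expensive path the ticket wants to avoid.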
[jira] [Commented] (APEXMALHAR-2359) Optimise fire trigger to avoid go through all data
[ https://issues.apache.org/jira/browse/APEXMALHAR-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15709958#comment-15709958 ] ASF GitHub Bot commented on APEXMALHAR-2359: GitHub user brightchen opened a pull request: https://github.com/apache/apex-malhar/pull/518 APEXMALHAR-2359 #resolve #comment Optimise fire trigger to avoid go t… …hrough all data You can merge this pull request into a Git repository by running: $ git pull https://github.com/brightchen/apex-malhar APEXMALHAR-2359 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-malhar/pull/518.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #518 commit 665452248dfd437c176e5364665266839943c30b Author: brightchen Date: 2016-11-29T23:05:09Z APEXMALHAR-2359 #resolve #comment Optimise fire trigger to avoid go through all data > Optimise fire trigger to avoid go through all data > -- > > Key: APEXMALHAR-2359 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2359 > Project: Apache Apex Malhar > Issue Type: Improvement >Reporter: bright chen >Assignee: bright chen > Original Estimate: 144h > Remaining Estimate: 144h > > KeyedWindowedOperatorImpl.fireNormalTrigger(Window, boolean) currently goes > through each window and key to check values. The data collection can be very > large, as the discard period can span a relatively long time. If > fireOnlyUpdatedPanes is false, there is probably not much room for improvement. > But if fireOnlyUpdatedPanes is true, we don't have to go through the > whole data collection; we only need to go through the windows and keys that > were updated after the last trigger. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
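The bookkeeping behind the ticket can be sketched as follows: track which (window, key) panes were touched since the last trigger, and when fireOnlyUpdatedPanes is true, visit only those. This is a simplified model with illustrative names, not the actual KeyedWindowedOperatorImpl code:

```java
import java.util.*;
import java.util.function.BiConsumer;

// Sketch of the APEXMALHAR-2359 idea: remember the panes updated since the
// last trigger so the trigger does not walk the whole data collection.
// Illustrative model only.
class TriggerTracker<K, V> {
  private final Map<Long, Map<K, V>> panes = new HashMap<>();
  // (windowId -> keys) touched since the last trigger fired
  private final Map<Long, Set<K>> updatedSinceLastTrigger = new HashMap<>();

  void accumulate(long windowId, K key, V value) {
    panes.computeIfAbsent(windowId, w -> new HashMap<>()).put(key, value);
    updatedSinceLastTrigger.computeIfAbsent(windowId, w -> new HashSet<>()).add(key);
  }

  void fireTrigger(boolean fireOnlyUpdatedPanes, BiConsumer<K, V> emit) {
    if (fireOnlyUpdatedPanes) {
      // Only walk the panes recorded as updated -- typically a small subset
      // of all retained data.
      for (Map.Entry<Long, Set<K>> e : updatedSinceLastTrigger.entrySet()) {
        Map<K, V> window = panes.get(e.getKey());
        for (K key : e.getValue()) {
          emit.accept(key, window.get(key));
        }
      }
      updatedSinceLastTrigger.clear();
    } else {
      // Without the flag every pane must be visited anyway.
      for (Map<K, V> window : panes.values()) {
        window.forEach(emit);
      }
    }
  }
}
```

A second trigger with no intervening updates then emits nothing, instead of re-scanning everything retained during the discard period.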
[GitHub] apex-malhar pull request #518: APEXMALHAR-2359 #resolve #comment Optimise fi...
GitHub user brightchen opened a pull request: https://github.com/apache/apex-malhar/pull/518 APEXMALHAR-2359 #resolve #comment Optimise fire trigger to avoid go t… …hrough all data You can merge this pull request into a Git repository by running: $ git pull https://github.com/brightchen/apex-malhar APEXMALHAR-2359 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-malhar/pull/518.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #518 commit 665452248dfd437c176e5364665266839943c30b Author: brightchen Date: 2016-11-29T23:05:09Z APEXMALHAR-2359 #resolve #comment Optimise fire trigger to avoid go through all data --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
Re: [DISCUSSION] Custom Control Tuples
I have created an umbrella ticket for control tuple support: https://issues.apache.org/jira/browse/APEXCORE-579 Currently it has two subtasks. Please have a look at them and see whether I'm missing anything or if you have anything to add. You are welcome to add more subtasks or comment on the existing subtasks. We would like to kick start the implementation soon. Thanks! David On Mon, Nov 28, 2016 at 5:22 PM, Bhupesh Chawdawrote: > +1 for the plan. > > I would be interested in contributing to this feature. > > ~ Bhupesh > > On Nov 29, 2016 03:26, "Sandesh Hegde" wrote: > > > I am interested in contributing to this feature. > > > > On Mon, Nov 28, 2016 at 1:54 PM David Yan wrote: > > > > > I think we should probably go ahead with option 1 since this works with > > > most use cases and prevents developers from shooting themselves in the > > foot > > > in terms of idempotency. > > > > > > We can have a configuration property that enables option 2 later if we > > have > > > concrete use cases that call for it. > > > > > > Please share your thoughts if you think you don't agree with this plan. > > > Also, please indicate if you're interested in contributing to this > > feature. > > > > > > David > > > > > > On Sun, Nov 27, 2016 at 9:02 PM, Bhupesh Chawda < > bhup...@datatorrent.com > > > > > > wrote: > > > > > > > It appears that option 1 is more favored due to unavailability of a > use > > > > case which could use option 2. > > > > > > > > However, option 2 is problematic in specific cases, like presence of > > > > multiple input ports for example. In case of a linear DAG where > control > > > > tuples are flowing in order with the data tuples, it should not be > > > > difficult to guarantee idempotency. For example, cases where there > > could > > > be > > > > multiple changes in behavior of an operator during a single window, > it > > > > should not wait until end window for these changes to take effect. 
> > Since, > > > > we don't have a concrete use case right now, perhaps we do not want > to > > go > > > > that road. This feature should be available through a platform > > attribute > > > > (may be at a later point in time) where the default is option 1. > > > > > > > > I think option 1 is suitable for a starting point in the > implementation > > > of > > > > this feature and we should proceed with it. > > > > > > > > ~ Bhupesh > > > > > > > > > > > > > > > > On Fri, Nov 11, 2016 at 12:59 AM, David Yan > > > wrote: > > > > > > > > > Good question Tushar. The callback should be called only once. > > > > > The way to implement this is to keep a list of control tuple hashes > > for > > > > the > > > > > given streaming window and only do the callback when the operator > has > > > not > > > > > seen it before. > > > > > > > > > > Other thoughts? > > > > > > > > > > David > > > > > > > > > > On Thu, Nov 10, 2016 at 9:32 AM, Tushar Gosavi < > > tus...@datatorrent.com > > > > > > > > > wrote: > > > > > > > > > > > Hi David, > > > > > > > > > > > > What would be the behaviour in case where we have a DAG with > > > following > > > > > > operators, the number in bracket is number of partitions, X is > NxM > > > > > > partitioning. > > > > > > A(1) X B(4) X C(2) > > > > > > > > > > > > If A sends a control tuple, it will be sent to all 4 partition of > > B, > > > > > > and from each partition from B it goes to C, i.e each partition > of > > C > > > > > > will receive same control tuple originated from A multiple times > > > > > > (number of upstream partitions of C). In this case will the > > callback > > > > > > function get called multiple times or just once. > > > > > > > > > > > > -Tushar. 
> > > > > > > > > > > > > > > > > > On Fri, Nov 4, 2016 at 12:14 AM, David Yan < > da...@datatorrent.com> > > > > > wrote: > > > > > > > Hi Bhupesh, > > > > > > > > > > > > > > Since each input port has its own incoming control tuple, I > would > > > > > imagine > > > > > > > there would be an additional DefaultInputPort.processControl > > method > > > > > that > > > > > > > operator developers can override. > > > > > > > If we go for option 1, my thinking is that the control tuples > > would > > > > > > always > > > > > > > be delivered at the next window boundary, even if the emit > method > > > is > > > > > > called > > > > > > > within a window. > > > > > > > > > > > > > > David > > > > > > > > > > > > > > On Thu, Nov 3, 2016 at 1:46 AM, Bhupesh Chawda < > > > > > bhup...@datatorrent.com> > > > > > > > wrote: > > > > > > > > > > > > > >> I have a question regarding the callback for a control tuple. > > Will > > > > it > > > > > be > > > > > > >> similar to InputPort::process() method? Something like > > > > > > >> InputPort::processControlTuple(t) > > > > > > >> ? Or will it be a method of the operator similar to > > beginWindow()? > > > > > > >> > > > > > > >> When we say that the control tuple will be delivered at
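The dedup rule David describes for Tushar's A(1) x B(4) x C(2) scenario (fire the callback only on first arrival of a control tuple within a streaming window, prune state at commit) could look roughly like this. This is a simplified model for illustration, not engine code:

```java
import java.util.*;

// With A(1) x B(4) x C(2), each C partition receives the same control tuple
// once per upstream B partition. Keeping the hashes of control tuples seen
// in the current streaming window lets the engine invoke the user callback
// only on the first arrival. Simplified sketch; names are illustrative.
class ControlTupleDeduper {
  private final Map<Long, Set<Integer>> seenPerWindow = new HashMap<>();

  // Returns true iff the user callback should fire for this tuple.
  boolean onControlTuple(long windowId, Object tuple) {
    Set<Integer> seen = seenPerWindow.computeIfAbsent(windowId, w -> new HashSet<>());
    return seen.add(tuple.hashCode());  // false on repeat arrivals
  }

  // Drop bookkeeping for windows at or below the committed window id.
  void committed(long windowId) {
    seenPerWindow.keySet().removeIf(w -> w <= windowId);
  }
}
```

Because the state is keyed by streaming window, duplicates arriving from different upstream partitions within the same window collapse to one callback, while the same logical tuple can still fire again in a later window.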
[jira] [Updated] (APEXCORE-581) Delivery of Custom Control Tuples
[ https://issues.apache.org/jira/browse/APEXCORE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Yan updated APEXCORE-581: --- Description: The behavior should be as follows: - The control tuples should only be sent downstream at streaming window boundaries - The control tuples should be sent to all partitions downstream - The control tuples should be sent in the same order of arrival. - Within a streaming window, do not send the same control tuple twice, even if the same control tuple is received multiple times within that window. This is possible if the operator has two input ports. (The LinkedHashMap should easily be able to ensure both order and uniqueness.) - The delivery of control tuples needs to stop at DelayOperator. - When a streaming window is committed, remove the associated LinkedHashMaps that belong to windows with IDs less than the committed window - It's safe to assume the control tuples are rare enough to fit in memory This will involve an additional MessageType to represent a custom control tuple. We probably need to have a data structure (possibly a LinkedHashMap) per streaming window that stores the control tuples in the buffer server. was: This will involve an additional MessageType to represent a custom control tuple. We probably need to have a data structure (possibly a LinkedHashMap) per streaming window that stores the control tuple in the buffer server. The behavior should be as follow: - The control tuples should only be sent to downstream at streaming window boundaries - The control tuples should be sent to all partitions downstream - The control tuples should be sent in the same order of arrival. - Within a streaming window, do not send the same control tuple twice, even if the same control tuple is received multiple times within that window. This is possible if the operator has two input ports. (The LinkedHashMap should be easily able to do ensure both order and uniqueness.)
- The delivery of control tuples needs to stop at DelayOperator. - When a streaming window is committed, remove the associated LinkedHashMap that belong to windows with IDs that are less than the committed window - It's safe to assume the control tuples are rare enough and can fit in memory > Delivery of Custom Control Tuples > - > > Key: APEXCORE-581 > URL: https://issues.apache.org/jira/browse/APEXCORE-581 > Project: Apache Apex Core > Issue Type: Sub-task >Reporter: David Yan > > The behavior should be as follows: > - The control tuples should only be sent downstream at streaming window > boundaries > - The control tuples should be sent to all partitions downstream > - The control tuples should be sent in the same order of arrival. > - Within a streaming window, do not send the same control tuple twice, even > if the same control tuple is received multiple times within that window. This > is possible if the operator has two input ports. (The LinkedHashMap should be > easily able to ensure both order and uniqueness.) > - The delivery of control tuples needs to stop at DelayOperator. > - When a streaming window is committed, remove the associated LinkedHashMaps > that belong to windows with IDs less than the committed window > - It's safe to assume the control tuples are rare enough to fit in memory > This will involve an additional MessageType to represent a custom control > tuple. > We probably need to have a data structure (possibly a LinkedHashMap) per > streaming window that stores the control tuples in the buffer server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
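The per-window LinkedHashMap bookkeeping described in the ticket can be sketched as below. This is a simplified model under the ticket's assumptions (control tuples are rare and fit in memory); the real structure would live in the buffer server, and the names are illustrative:

```java
import java.util.*;

// Sketch of APEXCORE-581's bookkeeping: a LinkedHashSet (backed by a
// LinkedHashMap) per streaming window gives both arrival order and
// uniqueness, and is pruned once a window is committed. Illustrative only.
class ControlTupleBuffer<T> {
  // windowId -> control tuples in order of first arrival, duplicates suppressed
  private final Map<Long, LinkedHashSet<T>> byWindow = new HashMap<>();

  void add(long windowId, T tuple) {
    // LinkedHashSet ignores re-insertion, so the same control tuple arriving
    // twice within one window (e.g. via two input ports) is stored once.
    byWindow.computeIfAbsent(windowId, w -> new LinkedHashSet<>()).add(tuple);
  }

  // Delivered at the streaming window boundary, in order of first arrival.
  List<T> drainAtWindowBoundary(long windowId) {
    LinkedHashSet<T> tuples = byWindow.remove(windowId);
    if (tuples == null) {
      return new ArrayList<>();
    }
    return new ArrayList<>(tuples);
  }

  // On commit, discard bookkeeping for all windows with IDs below the
  // committed window.
  void committed(long committedWindowId) {
    byWindow.keySet().removeIf(w -> w < committedWindowId);
  }
}
```

A LinkedHashSet gives both required properties in one structure: set semantics for the "never send the same control tuple twice per window" rule and insertion order for "same order of arrival".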
[jira] [Updated] (APEXCORE-580) Interface for processing and emitting control tuples
[ https://issues.apache.org/jira/browse/APEXCORE-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Yan updated APEXCORE-580: --- Description: DefaultOutputPort needs to have an emitControl method that operator code can call to emit a control tuple. DefaultInputPort needs to have a processControl method so that the operator can act on the arrival of a control tuple. Similar to a regular data tuple, we also need to provide a way for the user to provide custom serialization for the control tuple. We need to design this so that the default behavior is to propagate control tuples to all output ports, and it should allow the user to easily change that behavior. The user can selectively propagate control tuples to certain output ports, or block the propagation altogether. was: DefaultOutputPort needs to have a emitControl method so that operator code can call to emit a control tuple. DefaultInputPort needs to have a processControl method so that operator would be able to act on the arrival of a control tuple. We need to design this so that the default behavior is to propagate control tuples to all output ports, and it should allow the user tp easily change that behavior. The user can selectively propagate control tuples to certain output ports, or block the propagation altogether. > Interface for processing and emitting control tuples > > > Key: APEXCORE-580 > URL: https://issues.apache.org/jira/browse/APEXCORE-580 > Project: Apache Apex Core > Issue Type: Sub-task >Reporter: David Yan > > DefaultOutputPort needs to have an emitControl method that operator code > can call to emit a control tuple. > DefaultInputPort needs to have a processControl method so that the operator > can act on the arrival of a control tuple. > Similar to a regular data tuple, we also need to provide a way for the user > to provide custom serialization for the control tuple.
> We need to design this so that the default behavior is to propagate control > tuples to all output ports, and it should allow the user to easily change > that behavior. The user can selectively propagate control tuples to certain > output ports, or block the propagation altogether. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
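A rough sketch of the API shape under discussion follows. These classes and signatures are an illustration of the proposal (emitControl on output ports, a processControl hook on input ports, propagate-to-all-outputs by default), not the final Apex API:

```java
import java.util.*;
import java.util.function.Consumer;

// Illustrative stand-ins for the proposed port API; not actual Apex classes.
class SketchOutputPort<T> {
  private final List<Consumer<Object>> controlSinks = new ArrayList<>();

  // Proposed emitControl: fan the control tuple out to all connected sinks
  // (i.e. all downstream partitions).
  void emitControl(Object controlTuple) {
    for (Consumer<Object> sink : controlSinks) {
      sink.accept(controlTuple);
    }
  }

  void addControlSink(Consumer<Object> sink) {
    controlSinks.add(sink);
  }
}

abstract class SketchInputPort<T> {
  // Regular data path, as in DefaultInputPort.process().
  abstract void process(T tuple);

  // Proposed control path. Default behavior: tell the engine to propagate
  // the tuple to all of the operator's output ports. An operator could
  // override this to forward selectively or return false to block
  // propagation altogether.
  boolean processControl(Object controlTuple) {
    return true;  // propagate by default
  }
}
```

The overridable default mirrors the ticket's requirement: propagation to all output ports unless the operator opts out.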
[jira] [Updated] (APEXCORE-580) Interface for processing and emitting control tuples
[ https://issues.apache.org/jira/browse/APEXCORE-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Yan updated APEXCORE-580: --- Summary: Interface for processing and emitting control tuples (was: Add methods for processing and emitting control tuples) > Interface for processing and emitting control tuples > > > Key: APEXCORE-580 > URL: https://issues.apache.org/jira/browse/APEXCORE-580 > Project: Apache Apex Core > Issue Type: Sub-task >Reporter: David Yan > > DefaultOutputPort needs to have a emitControl method so that operator code > can call to emit a control tuple. > DefaultInputPort needs to have a processControl method so that operator would > be able to act on the arrival of a control tuple. > We need to design this so that the default behavior is to propagate control > tuples to all output ports, and it should allow the user tp easily change > that behavior. The user can selectively propagate control tuples to certain > output ports, or block the propagation altogether. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (APEXCORE-580) Add methods for processing and emitting control tuples
[ https://issues.apache.org/jira/browse/APEXCORE-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Yan updated APEXCORE-580: --- Description: DefaultOutputPort needs to have a emitControl method so that operator code can call to emit a control tuple. DefaultInputPort needs to have a processControl method so that operator would be able to act on the arrival of a control tuple. We need to design this so that the default behavior is to propagate control tuples to all output ports, and it should allow the user tp easily change that behavior. The user can selectively propagate control tuples to certain output ports, or block the propagation altogether. was: DefaultOutputPort needs to have a emitControl method so that operator code can call to emit a control tuple. DefaultInputPort needs to have a processControl method so that operator would be able to act on the arrival of a control tuple. > Add methods for processing and emitting control tuples > -- > > Key: APEXCORE-580 > URL: https://issues.apache.org/jira/browse/APEXCORE-580 > Project: Apache Apex Core > Issue Type: Sub-task >Reporter: David Yan > > DefaultOutputPort needs to have a emitControl method so that operator code > can call to emit a control tuple. > DefaultInputPort needs to have a processControl method so that operator would > be able to act on the arrival of a control tuple. > We need to design this so that the default behavior is to propagate control > tuples to all output ports, and it should allow the user tp easily change > that behavior. The user can selectively propagate control tuples to certain > output ports, or block the propagation altogether. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (APEXCORE-581) Delivery of Custom Control Tuples
David Yan created APEXCORE-581: -- Summary: Delivery of Custom Control Tuples Key: APEXCORE-581 URL: https://issues.apache.org/jira/browse/APEXCORE-581 Project: Apache Apex Core Issue Type: Sub-task Reporter: David Yan This will involve an additional MessageType to represent a custom control tuple. We probably need to have a data structure (possibly a LinkedHashMap) per streaming window that stores the control tuples in the buffer server. The behavior should be as follows: - The control tuples should only be sent downstream at streaming window boundaries - The control tuples should be sent to all partitions downstream - The control tuples should be sent in the same order of arrival. - Within a streaming window, do not send the same control tuple twice, even if the same control tuple is received multiple times within that window. This is possible if the operator has two input ports. (The LinkedHashMap should easily be able to ensure both order and uniqueness.) - The delivery of control tuples needs to stop at DelayOperator. - When a streaming window is committed, remove the associated LinkedHashMaps that belong to windows with IDs less than the committed window - It's safe to assume the control tuples are rare enough to fit in memory -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (APEXCORE-580) Add methods for processing and emitting control tuples
David Yan created APEXCORE-580: -- Summary: Add methods for processing and emitting control tuples Key: APEXCORE-580 URL: https://issues.apache.org/jira/browse/APEXCORE-580 Project: Apache Apex Core Issue Type: Sub-task Reporter: David Yan DefaultOutputPort needs to have a emitControl method so that operator code can call to emit a control tuple. DefaultInputPort needs to have a processControl method so that operator would be able to act on the arrival of a control tuple. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: "ExcludeNodes" for an Apex application
Apex has automatic blacklisting of troublesome nodes; please take a look at the following attributes: MAX_CONSECUTIVE_CONTAINER_FAILURES_FOR_BLACKLIST https://www.datatorrent.com/docs/apidocs/com/datatorrent/api/Context.DAGContext.html#MAX_CONSECUTIVE_CONTAINER_FAILURES_FOR_BLACKLIST BLACKLISTED_NODE_REMOVAL_TIME_MILLIS Thanks On Wed, Nov 30, 2016 at 12:56 PM Munagala Ramanath wrote: Not sure if this is what Milind had in mind but we often run into situations where the dev group working with Apex has no control over cluster configuration -- to make any changes to the cluster they need to go through an elaborate process that can take many days. Meanwhile, if they notice that a particular node is consistently causing problems for their app, having a simple way to exclude it would be very helpful since it gives them a way to bypass communication and process issues within their own organization. Ram On Wed, Nov 30, 2016 at 10:58 AM, Sanjay Pujare wrote: > To me both use cases appear to be generic resource management use cases. > For example, a randomly rebooting node is not good for any purpose esp. > long running apps so it is a bit of a stretch to imagine that these nodes > will be acceptable for some batch jobs in Yarn. So such a node should be > marked “Bad” or Unavailable in Yarn itself. > > Second use case is also typical anti-affinity use case which ideally > should be implemented in Yarn – Milind’s example can also apply to non-Apex > batch jobs. In any case it looks like Yarn still doesn’t have it ( > https://issues.apache.org/jira/browse/YARN-1042) so if Apex needs it we > will need to do it ourselves. > > On 11/30/16, 10:39 AM, "Munagala Ramanath" wrote: > > But then, what's the solution to the 2 problem scenarios that Milind > describes ? > > Ram > > On Wed, Nov 30, 2016 at 10:34 AM, Sanjay Pujare < > san...@datatorrent.com> > wrote: > > > I think “exclude nodes” and such is really the job of the resource > manager > > i.e. Yarn.
So I am not sure taking over some of these tasks in Apex > would > > be very useful. > > > > I agree with Amol that apps should be node neutral. Resource > management in > > Yarn together with fault tolerance in Apex should minimize the need > for > > this feature although I am sure one can find use cases. > > > > > > On 11/29/16, 10:41 PM, "Amol Kekre" wrote: > > > > We do have this feature in Yarn, but that applies to all > applications. > > I am > > not sure if Yarn has anti-affinity. This feature may be used, > but in > > general there is danger is an application taking over resource > > allocation. > > Another quirk is that big data apps should ideally be > node-neutral. > > This is > > a good idea, if we are able to carve out something where need is > app > > specific. > > > > Thks > > Amol > > > > > > On Tue, Nov 29, 2016 at 10:00 PM, Milind Barve < > mili...@gmail.com> > > wrote: > > > > > We have seen 2 cases mentioned below, where, it would have > been nice > > if > > > Apex allowed us to exclude a node from the cluster for an > > application. > > > > > > 1. A node in the cluster had gone bad (was randomly rebooting) > and > > so an > > > Apex app should not use it - other apps can use it as they were > > batch jobs. > > > 2. A node is being used for a mission critical app (Could be > an Apex > > app > > > itself), but another Apex app which is mission critical should > not > > be using > > > resources on that node. > > > > > > Can we have a way in which, Stram and YARN can coordinate > between > > each > > > other to not use a set of nodes for the application. It an be > done > > in 2 way > > > s- > > > > > > 1. Have a list of "exclude" nodes with Stram- when YARN > allcates > > resources > > > on either of these, STRAM rejects and gets resources allocated > again > > frm > > > YARN > > > 2. Have a list of nodes that can be used for an app - This can > be a > > part of > > > config. 
Hwever, I don't think this would be a right way to do > so as > > we will > > > need support from YARN as well. Further, this might be > difficult to > > change > > > at runtim if need be. > > > > > > Any thoughts? > > > > > > > > > -- > > > ~Milind bee at gee mail dot com > > > > > > > > > > > > > > >
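For reference, the blacklisting attributes mentioned at the top of this thread could be set in an application configuration file roughly as follows. This is a hedged sketch: the `dt.attr` property prefix and the values shown are assumptions to be verified against the Context.DAGContext documentation for your Apex version.

```xml
<!-- Illustrative sketch: blacklist a node after 3 consecutive container
     failures, and remove it from the blacklist after one hour. -->
<configuration>
  <property>
    <name>dt.attr.MAX_CONSECUTIVE_CONTAINER_FAILURES_FOR_BLACKLIST</name>
    <value>3</value>
  </property>
  <property>
    <name>dt.attr.BLACKLISTED_NODE_REMOVAL_TIME_MILLIS</name>
    <value>3600000</value>
  </property>
</configuration>
```

This gives per-application exclusion of consistently failing nodes without requiring any cluster-level YARN configuration change.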
Re: "ExcludeNodes" for an Apex application
Not sure if this is what Milind had in mind but we often run into situations where the dev group working with Apex has no control over cluster configuration -- to make any changes to the cluster they need to go through an elaborate process that can take many days. Meanwhile, if they notice that a particular node is consistently causing problems for their app, having a simple way to exclude it would be very helpful since it gives them a way to bypass communication and process issues within their own organization. Ram On Wed, Nov 30, 2016 at 10:58 AM, Sanjay Pujarewrote: > To me both use cases appear to be generic resource management use cases. > For example, a randomly rebooting node is not good for any purpose esp. > long running apps so it is a bit of a stretch to imagine that these nodes > will be acceptable for some batch jobs in Yarn. So such a node should be > marked “Bad” or Unavailable in Yarn itself. > > Second use case is also typical anti-affinity use case which ideally > should be implemented in Yarn – Milind’s example can also apply to non-Apex > batch jobs. In any case it looks like Yarn still doesn’t have it ( > https://issues.apache.org/jira/browse/YARN-1042) so if Apex needs it we > will need to do it ourselves. > > On 11/30/16, 10:39 AM, "Munagala Ramanath" wrote: > > But then, what's the solution to the 2 problem scenarios that Milind > describes ? > > Ram > > On Wed, Nov 30, 2016 at 10:34 AM, Sanjay Pujare < > san...@datatorrent.com> > wrote: > > > I think “exclude nodes” and such is really the job of the resource > manager > > i.e. Yarn. So I am not sure taking over some of these tasks in Apex > would > > be very useful. > > > > I agree with Amol that apps should be node neutral. Resource > management in > > Yarn together with fault tolerance in Apex should minimize the need > for > > this feature although I am sure one can find use cases. 
> > > > > > On 11/29/16, 10:41 PM, "Amol Kekre" wrote: > > > > We do have this feature in Yarn, but that applies to all > applications. > > I am > > not sure if Yarn has anti-affinity. This feature may be used, > but in > > general there is danger is an application taking over resource > > allocation. > > Another quirk is that big data apps should ideally be > node-neutral. > > This is > > a good idea, if we are able to carve out something where need is > app > > specific. > > > > Thks > > Amol > > > > > > On Tue, Nov 29, 2016 at 10:00 PM, Milind Barve < > mili...@gmail.com> > > wrote: > > > > > We have seen 2 cases mentioned below, where, it would have > been nice > > if > > > Apex allowed us to exclude a node from the cluster for an > > application. > > > > > > 1. A node in the cluster had gone bad (was randomly rebooting) > and > > so an > > > Apex app should not use it - other apps can use it as they were > > batch jobs. > > > 2. A node is being used for a mission critical app (Could be > an Apex > > app > > > itself), but another Apex app which is mission critical should > not > > be using > > > resources on that node. > > > > > > Can we have a way in which, Stram and YARN can coordinate > between > > each > > > other to not use a set of nodes for the application. It an be > done > > in 2 way > > > s- > > > > > > 1. Have a list of "exclude" nodes with Stram- when YARN > allcates > > resources > > > on either of these, STRAM rejects and gets resources allocated > again > > frm > > > YARN > > > 2. Have a list of nodes that can be used for an app - This can > be a > > part of > > > config. Hwever, I don't think this would be a right way to do > so as > > we will > > > need support from YARN as well. Further, this might be > difficult to > > change > > > at runtim if need be. > > > > > > Any thoughts? > > > > > > > > > -- > > > ~Milind bee at gee mail dot com > > > > > > > > > > > > > > >
Re: [VOTE] Apache Apex Malhar Release 3.6.0 (RC1)
+1 Verified checksums Verified compilation Verified build and test Verified pi demo On Wed, Nov 30, 2016 at 9:50 AM, Tushar Gosaviwrote: > +1 > > Verified checksums > Verified compilation > > - Tushar. > > > On Wed, Nov 30, 2016 at 7:43 PM, Thomas Weise wrote: > > Can folks please verify the release. > > > > Thanks > > > > -- > > sent from mobile > > On Nov 26, 2016 6:32 PM, "Thomas Weise" wrote: > > > >> Dear Community, > >> > >> Please vote on the following Apache Apex Malhar 3.6.0 release candidate. > >> > >> This is a source release with binary artifacts published to Maven. > >> > >> This release is based on Apex Core 3.4 and resolves 69 issues. > >> > >> The release adds first iteration of SQL support via Apache Calcite, an > >> alternative Cassandra output operator (non-transactional, upsert based), > >> enrichment operator, improvements to window storage and new user > >> documentation for several operators along with many other enhancements > and > >> bug fixes. > >> > >> List of all issues fixed: https://s.apache.org/9b0t > >> User documentation: http://apex.apache.org/docs/malhar-3.6/ > >> > >> Staging directory: > >> https://dist.apache.org/repos/dist/dev/apex/apache-apex- > malhar-3.6.0-RC1/ > >> Source zip: > >> https://dist.apache.org/repos/dist/dev/apex/apache-apex- > >> malhar-3.6.0-RC1/apache-apex-malhar-3.6.0-source-release.zip > >> Source tar.gz: > >> https://dist.apache.org/repos/dist/dev/apex/apache-apex- > >> malhar-3.6.0-RC1/apache-apex-malhar-3.6.0-source-release.tar.gz > >> Maven staging repository: > >> https://repository.apache.org/content/repositories/orgapacheapex-1020/ > >> > >> Git source: > >> https://git-wip-us.apache.org/repos/asf?p=apex-malhar.git;a= > >> commit;h=refs/tags/v3.6.0-RC1 > >> (commit: 43d524dc5d5326b8d94593901cad026528bb62a1) > >> > >> PGP key: > >> http://pgp.mit.edu:11371/pks/lookup?op=vindex=t...@apache.org > >> KEYS file: > >> https://dist.apache.org/repos/dist/release/apex/KEYS > >> > >> More information at: > 
>> http://apex.apache.org > >> > >> Please try the release and vote; vote will be open util Wed, 11/30 EOD > PST > >> considering the US holiday weekend. > >> > >> [ ] +1 approve (and what verification was done) > >> [ ] -1 disapprove (and reason why) > >> > >> http://www.apache.org/foundation/voting.html > >> > >> How to verify release candidate: > >> > >> http://apex.apache.org/verification.html > >> > >> Thanks, > >> Thomas > >> > >> >
Re: "ExcludeNodes" for an Apex application
I agree, a randomly rebooting node is a Yarn issue. Even anti-affinity between apps should be handled by Yarn in the long run. We could contribute to the above JIRA. Thks Amol On Wed, Nov 30, 2016 at 10:58 AM, Sanjay Pujare wrote: > To me both use cases appear to be generic resource management use cases. > For example, a randomly rebooting node is not good for any purpose esp. > long running apps so it is a bit of a stretch to imagine that these nodes > will be acceptable for some batch jobs in Yarn. So such a node should be > marked “Bad” or Unavailable in Yarn itself. > > Second use case is also typical anti-affinity use case which ideally > should be implemented in Yarn – Milind’s example can also apply to non-Apex > batch jobs. In any case it looks like Yarn still doesn’t have it ( > https://issues.apache.org/jira/browse/YARN-1042) so if Apex needs it we > will need to do it ourselves. > > On 11/30/16, 10:39 AM, "Munagala Ramanath" wrote: > > But then, what's the solution to the 2 problem scenarios that Milind > describes ? > > Ram > > On Wed, Nov 30, 2016 at 10:34 AM, Sanjay Pujare < > san...@datatorrent.com> > wrote: > > > I think “exclude nodes” and such is really the job of the resource > manager > > i.e. Yarn. So I am not sure taking over some of these tasks in Apex > would > > be very useful. > > > > I agree with Amol that apps should be node neutral. Resource > management in > > Yarn together with fault tolerance in Apex should minimize the need > for > > this feature although I am sure one can find use cases. > > > > > > On 11/29/16, 10:41 PM, "Amol Kekre" wrote: > > > > We do have this feature in Yarn, but that applies to all > applications. > > I am > > not sure if Yarn has anti-affinity. This feature may be used, > but in > > general there is danger is an application taking over resource > > allocation. > > Another quirk is that big data apps should ideally be > node-neutral.
> > This is > > a good idea, if we are able to carve out something where need is > app > > specific. > > > > Thks > > Amol > > > > > > On Tue, Nov 29, 2016 at 10:00 PM, Milind Barve < > mili...@gmail.com> > > wrote: > > > > > We have seen 2 cases mentioned below, where, it would have > been nice > > if > > > Apex allowed us to exclude a node from the cluster for an > > application. > > > > > > 1. A node in the cluster had gone bad (was randomly rebooting) > and > > so an > > > Apex app should not use it - other apps can use it as they were > > batch jobs. > > > 2. A node is being used for a mission critical app (Could be > an Apex > > app > > > itself), but another Apex app which is mission critical should > not > > be using > > > resources on that node. > > > > > > Can we have a way in which, Stram and YARN can coordinate > between > > each > > > other to not use a set of nodes for the application. It an be > done > > in 2 way > > > s- > > > > > > 1. Have a list of "exclude" nodes with Stram- when YARN > allcates > > resources > > > on either of these, STRAM rejects and gets resources allocated > again > > frm > > > YARN > > > 2. Have a list of nodes that can be used for an app - This can > be a > > part of > > > config. Hwever, I don't think this would be a right way to do > so as > > we will > > > need support from YARN as well. Further, this might be > difficult to > > change > > > at runtim if need be. > > > > > > Any thoughts? > > > > > > > > > -- > > > ~Milind bee at gee mail dot com > > > > > > > > > > > > > > >
Re: "ExcludeNodes" for an Apex application
To me both use cases appear to be generic resource management use cases. For example, a randomly rebooting node is not good for any purpose, especially long-running apps, so it is a bit of a stretch to imagine that these nodes will be acceptable for some batch jobs in YARN. Such a node should be marked “Bad” or “Unavailable” in YARN itself.

The second use case is also a typical anti-affinity use case which ideally should be implemented in YARN; Milind's example can also apply to non-Apex batch jobs. In any case it looks like YARN still doesn't have it (https://issues.apache.org/jira/browse/YARN-1042), so if Apex needs it we will need to do it ourselves.
Re: "ExcludeNodes" for an Apex application
But then, what's the solution to the 2 problem scenarios that Milind describes?

Ram
Re: "ExcludeNodes" for an Apex application
I think “exclude nodes” and such is really the job of the resource manager, i.e. YARN. So I am not sure taking over some of these tasks in Apex would be very useful.

I agree with Amol that apps should be node-neutral. Resource management in YARN together with fault tolerance in Apex should minimize the need for this feature, although I am sure one can find use cases.

On 11/29/16, 10:41 PM, "Amol Kekre" wrote:

We do have this feature in YARN, but that applies to all applications. I am not sure if YARN has anti-affinity. This feature may be used, but in general there is danger in an application taking over resource allocation. Another quirk is that big data apps should ideally be node-neutral. This is a good idea, if we are able to carve out something where the need is app specific.

Thks
Amol

On Tue, Nov 29, 2016 at 10:00 PM, Milind Barve wrote:

> We have seen 2 cases, mentioned below, where it would have been nice if Apex allowed us to exclude a node from the cluster for an application.
>
> 1. A node in the cluster had gone bad (was randomly rebooting) and so an Apex app should not use it - other apps could use it as they were batch jobs.
> 2. A node is being used for a mission-critical app (could be an Apex app itself), but another Apex app which is mission critical should not be using resources on that node.
>
> Can we have a way in which Stram and YARN can coordinate with each other to not use a set of nodes for the application? It can be done in 2 ways:
>
> 1. Have a list of "exclude" nodes with Stram - when YARN allocates resources on any of these, Stram rejects and gets resources allocated again from YARN.
> 2. Have a list of nodes that can be used for an app - this can be a part of config. However, I don't think this would be the right way to do so as we will need support from YARN as well. Further, this might be difficult to change at runtime if need be.
>
> Any thoughts?
>
> --
> ~Milind bee at gee mail dot com
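Milind's first option (an exclude list held by the application master, with containers on excluded nodes rejected and re-requested) can be sketched in plain Java. This is a hypothetical illustration only: `ContainerFilter` and `partition` are invented names, not actual Apex or YARN API. A real implementation would release the rejected containers back to YARN (e.g. via the resource manager client) and submit fresh container requests.

```java
import java.util.*;

// Hypothetical sketch: the AM keeps an exclude list and splits each YARN
// allocation into containers it accepts and containers it sends back.
class ContainerFilter {
    private final Set<String> excludedNodes;

    ContainerFilter(Collection<String> excludedNodes) {
        this.excludedNodes = new HashSet<>(excludedNodes);
    }

    /** Partition allocated containers (id -> host) into "accepted" and "rejected". */
    Map<String, List<String>> partition(Map<String, String> containerToNode) {
        List<String> accepted = new ArrayList<>();
        List<String> rejected = new ArrayList<>();
        for (Map.Entry<String, String> e : containerToNode.entrySet()) {
            if (excludedNodes.contains(e.getValue())) {
                rejected.add(e.getKey());   // would be released and re-requested
            } else {
                accepted.add(e.getKey());
            }
        }
        Map<String, List<String>> result = new HashMap<>();
        result.put("accepted", accepted);
        result.put("rejected", rejected);
        return result;
    }
}

public class Main {
    public static void main(String[] args) {
        ContainerFilter filter = new ContainerFilter(List.of("badnode01"));
        Map<String, String> alloc = new LinkedHashMap<>();
        alloc.put("container_1", "badnode01");
        alloc.put("container_2", "goodnode02");
        Map<String, List<String>> r = filter.partition(alloc);
        System.out.println(r.get("rejected")); // [container_1]
        System.out.println(r.get("accepted")); // [container_2]
    }
}
```

Note that the reject-and-re-request loop can thrash if YARN keeps offering the same excluded node, which is one reason the thread leans toward solving this inside YARN itself.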
Re: [VOTE] Apache Apex Malhar Release 3.6.0 (RC1)
Can folks please verify the release.

Thanks

-- sent from mobile

On Nov 26, 2016 6:32 PM, "Thomas Weise" wrote:

> Dear Community,
>
> Please vote on the following Apache Apex Malhar 3.6.0 release candidate.
>
> This is a source release with binary artifacts published to Maven.
>
> This release is based on Apex Core 3.4 and resolves 69 issues.
>
> The release adds the first iteration of SQL support via Apache Calcite, an alternative Cassandra output operator (non-transactional, upsert based), an enrichment operator, improvements to window storage, and new user documentation for several operators, along with many other enhancements and bug fixes.
>
> List of all issues fixed: https://s.apache.org/9b0t
> User documentation: http://apex.apache.org/docs/malhar-3.6/
>
> Staging directory:
> https://dist.apache.org/repos/dist/dev/apex/apache-apex-malhar-3.6.0-RC1/
> Source zip:
> https://dist.apache.org/repos/dist/dev/apex/apache-apex-malhar-3.6.0-RC1/apache-apex-malhar-3.6.0-source-release.zip
> Source tar.gz:
> https://dist.apache.org/repos/dist/dev/apex/apache-apex-malhar-3.6.0-RC1/apache-apex-malhar-3.6.0-source-release.tar.gz
> Maven staging repository:
> https://repository.apache.org/content/repositories/orgapacheapex-1020/
>
> Git source:
> https://git-wip-us.apache.org/repos/asf?p=apex-malhar.git;a=commit;h=refs/tags/v3.6.0-RC1
> (commit: 43d524dc5d5326b8d94593901cad026528bb62a1)
>
> PGP key:
> http://pgp.mit.edu:11371/pks/lookup?op=vindex=t...@apache.org
> KEYS file:
> https://dist.apache.org/repos/dist/release/apex/KEYS
>
> More information at: http://apex.apache.org
>
> Please try the release and vote; vote will be open until Wed, 11/30 EOD PST considering the US holiday weekend.
>
> [ ] +1 approve (and what verification was done)
> [ ] -1 disapprove (and reason why)
>
> http://www.apache.org/foundation/voting.html
>
> How to verify release candidate:
> http://apex.apache.org/verification.html
>
> Thanks,
> Thomas
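The "Verified checksums" step reported by several voters follows the standard ASF pattern: recompute the artifact's digest and compare it with the published checksum file. A minimal sketch of the checksum part, using a locally created stand-in file (with the real RC you would download the source archive and its published checksum file from the staging directory above):

```shell
# Stand-in for the RC artifact; replace with the real
# apache-apex-malhar-3.6.0-source-release.tar.gz and its checksum file.
echo "placeholder artifact contents" > artifact.tar.gz

# Publisher side: record the SHA-512 digest.
sha512sum artifact.tar.gz > artifact.tar.gz.sha512

# Verifier side: recompute and compare; prints "artifact.tar.gz: OK" on match.
sha512sum -c artifact.tar.gz.sha512
```

The signature check is analogous: import the KEYS file with `gpg --import KEYS`, then run `gpg --verify <artifact>.asc <artifact>`.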
[jira] [Commented] (APEXMALHAR-2022) S3 Output Module for file copy
[ https://issues.apache.org/jira/browse/APEXMALHAR-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15708479#comment-15708479 ]

ASF GitHub Bot commented on APEXMALHAR-2022:

Github user chaithu14 closed the pull request at:
https://github.com/apache/apex-malhar/pull/483

> S3 Output Module for file copy
> --
>
> Key: APEXMALHAR-2022
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2022
> Project: Apache Apex Malhar
> Issue Type: Task
> Reporter: Chaitanya
> Assignee: Chaitanya
>
> The primary functionality of this module is to copy files into an S3 bucket using a block-by-block approach.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXMALHAR-2022) S3 Output Module for file copy
[ https://issues.apache.org/jira/browse/APEXMALHAR-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15708481#comment-15708481 ]

ASF GitHub Bot commented on APEXMALHAR-2022:

GitHub user chaithu14 reopened a pull request:
https://github.com/apache/apex-malhar/pull/483

APEXMALHAR-2022 Developed S3 Output Module

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/chaithu14/incubator-apex-malhar APEXMALHAR-2022-S3Output-multiPart

Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/apex-malhar/pull/483.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #483

commit a5e8fa3facca750f5d7402c2c29e7cbabe53bd9e
Author: chaitanya
Date: 2016-11-30T05:17:36Z

    APEXMALHAR-2022 Development of S3 Output Module

> S3 Output Module for file copy
> --
>
> Key: APEXMALHAR-2022
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2022
> Project: Apache Apex Malhar
> Issue Type: Task
> Reporter: Chaitanya
> Assignee: Chaitanya
>
> The primary functionality of this module is to copy files into an S3 bucket using a block-by-block approach.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] apex-malhar pull request #483: APEXMALHAR-2022 Developed S3 Output Module
GitHub user chaithu14 reopened a pull request:
https://github.com/apache/apex-malhar/pull/483

APEXMALHAR-2022 Developed S3 Output Module

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/chaithu14/incubator-apex-malhar APEXMALHAR-2022-S3Output-multiPart

Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/apex-malhar/pull/483.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #483

commit a5e8fa3facca750f5d7402c2c29e7cbabe53bd9e
Author: chaitanya
Date: 2016-11-30T05:17:36Z

    APEXMALHAR-2022 Development of S3 Output Module

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
[GitHub] apex-malhar pull request #483: APEXMALHAR-2022 Developed S3 Output Module
Github user chaithu14 closed the pull request at: https://github.com/apache/apex-malhar/pull/483 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---