[jira] [Commented] (APEXCORE-536) Upgrade Hadoop dependency

2016-09-15 Thread Thomas Weise (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493691#comment-15493691
 ] 

Thomas Weise commented on APEXCORE-536:
---

We do have to worry about backward compatibility. I think it is safe to move to 
2.6 but this is important enough of a change to discuss and take a vote.  


> Upgrade Hadoop dependency
> -------------------------
>
> Key: APEXCORE-536
> URL: https://issues.apache.org/jira/browse/APEXCORE-536
> Project: Apache Apex Core
> Issue Type: Improvement
> Reporter: Thomas Weise
> Labels: roadmap
>
> Currently Apex depends on Hadoop 2.2 and runs on all later 2.x versions. 
> Hadoop 2.2 is quite old, and most Apex users have more recent Hadoop installs. 
> The latest distro releases are based on 2.6 and 2.7. There are several important 
> features that were added in Hadoop since 2.2 that Apex should be able to 
> leverage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXCORE-536) Upgrade Hadoop dependency

2016-09-15 Thread Thomas Weise (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493568#comment-15493568
 ] 

Thomas Weise commented on APEXCORE-536:
---

http://mail-archives.apache.org/mod_mbox/apex-dev/201607.mbox/%3CCAMqituOH+AmH=-lpjndtkcyu96wej9hgek+6wfqevt-sqpf...@mail.gmail.com%3E


> Upgrade Hadoop dependency
> -------------------------
>
> Key: APEXCORE-536
> URL: https://issues.apache.org/jira/browse/APEXCORE-536
> Project: Apache Apex Core
> Issue Type: Improvement
> Reporter: Thomas Weise
> Labels: roadmap
>
> Currently Apex depends on Hadoop 2.2 and runs on all later 2.x versions. 
> Hadoop 2.2 is quite old, and most Apex users have more recent Hadoop installs. 
> The latest distro releases are based on 2.6 and 2.7. There are several important 
> features that were added in Hadoop since 2.2 that Apex should be able to 
> leverage.





[jira] [Commented] (APEXCORE-534) Improve setup instructions in contributor guidelines

2016-09-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493585#comment-15493585
 ] 

ASF GitHub Bot commented on APEXCORE-534:
-

Github user tweise commented on a diff in the pull request:

https://github.com/apache/apex-site/pull/50#discussion_r78986398
  
--- Diff: src/md/contributing.md ---
@@ -14,7 +14,38 @@ This project welcomes new contributors and invites 
everyone to participate. Our
 
 People that help with the project in any of the above categories or other 
ways are contributors. See the 
[roles](http://www.apache.org/foundation/how-it-works.html#roles) as defined by 
the ASF. Community members that make sustained, welcome contributions to the 
project may be invited to become a [committer](/people.html). 
 
-## Code Style
+## Before coding: One time Setup
+
+### JIRA
+
+Apache JIRA is used for issue tracking. If you do not already have an 
Apache JIRA account, sign up [here](https://issues.apache.org/jira/). Note that 
the user name should have no white spaces or other special characters that 
complicate auto-completion within JIRA comments etc. 
+
+Please use a single JIRA account only (don't create multiple with 
different email addresses) to retain the issue history. Please use a permanent 
email address, for an existing account it can be changed in the profile. If you 
absolutely have to change your user name, contact INFRA.
+
+Apex has 2 JIRA projects:
+
+1. [APEXCORE](https://issues.apache.org/jira/browse/APEXCORE/) for 
[apex-core](https://github.com/apache/apex-core) and 
[apex-site](https://github.com/apache/apex-site)
+2. [APEXMALHAR](https://issues.apache.org/jira/browse/APEXMALHAR/) for 
[apex-malhar](https://github.com/apache/apex-malhar)
+
+Before working on changes for any of the repositories, please locate an 
existing JIRA ticket or submit a new one. In order to assign an issue to 
yourself, you need to be listed as contributor in the JIRA project. PMC members 
have access to add new contributors, please request to be added through a 
comment on the candidate ticket.
--- End diff --

I considered that but if you are looking at a ticket, then it is easier to 
do it from there and it will result in an email to the dev list.


> Improve setup instructions in contributor guidelines
> ----------------------------------------------------
>
> Key: APEXCORE-534
> URL: https://issues.apache.org/jira/browse/APEXCORE-534
> Project: Apache Apex Core
> Issue Type: Task
> Reporter: Thomas Weise
> Assignee: Thomas Weise
>
> Improve instructions for newcomers to get started contributing. 





[GitHub] apex-site pull request #50: APEXCORE-534 Improve contributor setup instructi...

2016-09-15 Thread chinmaykolhatkar
Github user chinmaykolhatkar commented on a diff in the pull request:

https://github.com/apache/apex-site/pull/50#discussion_r78988135
  
--- Diff: src/md/contributing.md ---
@@ -14,7 +14,38 @@ This project welcomes new contributors and invites 
everyone to participate. Our
 
 People that help with the project in any of the above categories or other 
ways are contributors. See the 
[roles](http://www.apache.org/foundation/how-it-works.html#roles) as defined by 
the ASF. Community members that make sustained, welcome contributions to the 
project may be invited to become a [committer](/people.html). 
 
-## Code Style
+## Before coding: One time Setup
+
+### JIRA
+
+Apache JIRA is used for issue tracking. If you do not already have an 
Apache JIRA account, sign up [here](https://issues.apache.org/jira/). Note that 
the user name should have no white spaces or other special characters that 
complicate auto-completion within JIRA comments etc. 
+
+Please use a single JIRA account only (don't create multiple with 
different email addresses) to retain the issue history. Please use a permanent 
email address, for an existing account it can be changed in the profile. If you 
absolutely have to change your user name, contact INFRA.
+
+Apex has 2 JIRA projects:
+
+1. [APEXCORE](https://issues.apache.org/jira/browse/APEXCORE/) for 
[apex-core](https://github.com/apache/apex-core) and 
[apex-site](https://github.com/apache/apex-site)
+2. [APEXMALHAR](https://issues.apache.org/jira/browse/APEXMALHAR/) for 
[apex-malhar](https://github.com/apache/apex-malhar)
+
+Before working on changes for any of the repositories, please locate an 
existing JIRA ticket or submit a new one. In order to assign an issue to 
yourself, you need to be listed as contributor in the JIRA project. PMC members 
have access to add new contributors, please request to be added through a 
comment on the candidate ticket.
+
+### Github and git
+
+We use GitHub’s pull request functionality to review proposed code 
changes. If you do not already have a personal GitHub account, sign up 
[here](https://github.com/join). We recommend that you use the same email 
address and first/lastname for emails, git and JIRA so that contributions can 
be better tracked and notifications correlated. It is also recommended that you 
use an email address that is valid permanently (for example the @apache.org 
address, if you have one). Please also see:
+
+* https://help.github.com/articles/setting-your-email-in-git/
+* 
https://help.github.com/articles/adding-an-email-address-to-your-github-account/
+* https://help.github.com/articles/keeping-your-email-address-private/
+
+The ASF Apex git repositories have mirror repositories on github which are 
used to review pull requests and provide a second remote endpoint for the 
codebase.
+
+1. Fork the ASF github mirror: https://github.com/apache/apex-core (or 
https://github.com/apache/apex-malhar or https://github.com/apache/apex-site) 
+1. Clone the **fork** on your local workspace (one time step):  
--- End diff --

I guess we can add a link to a git tutorial (https://try.github.io/) for 
those who have not used git until now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (APEXCORE-534) Improve setup instructions in contributor guidelines

2016-09-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493611#comment-15493611
 ] 

ASF GitHub Bot commented on APEXCORE-534:
-

Github user chinmaykolhatkar commented on a diff in the pull request:

https://github.com/apache/apex-site/pull/50#discussion_r78988135
  
--- Diff: src/md/contributing.md ---
@@ -14,7 +14,38 @@ This project welcomes new contributors and invites 
everyone to participate. Our
 
 People that help with the project in any of the above categories or other 
ways are contributors. See the 
[roles](http://www.apache.org/foundation/how-it-works.html#roles) as defined by 
the ASF. Community members that make sustained, welcome contributions to the 
project may be invited to become a [committer](/people.html). 
 
-## Code Style
+## Before coding: One time Setup
+
+### JIRA
+
+Apache JIRA is used for issue tracking. If you do not already have an 
Apache JIRA account, sign up [here](https://issues.apache.org/jira/). Note that 
the user name should have no white spaces or other special characters that 
complicate auto-completion within JIRA comments etc. 
+
+Please use a single JIRA account only (don't create multiple with 
different email addresses) to retain the issue history. Please use a permanent 
email address, for an existing account it can be changed in the profile. If you 
absolutely have to change your user name, contact INFRA.
+
+Apex has 2 JIRA projects:
+
+1. [APEXCORE](https://issues.apache.org/jira/browse/APEXCORE/) for 
[apex-core](https://github.com/apache/apex-core) and 
[apex-site](https://github.com/apache/apex-site)
+2. [APEXMALHAR](https://issues.apache.org/jira/browse/APEXMALHAR/) for 
[apex-malhar](https://github.com/apache/apex-malhar)
+
+Before working on changes for any of the repositories, please locate an 
existing JIRA ticket or submit a new one. In order to assign an issue to 
yourself, you need to be listed as contributor in the JIRA project. PMC members 
have access to add new contributors, please request to be added through a 
comment on the candidate ticket.
+
+### Github and git
+
+We use GitHub’s pull request functionality to review proposed code 
changes. If you do not already have a personal GitHub account, sign up 
[here](https://github.com/join). We recommend that you use the same email 
address and first/lastname for emails, git and JIRA so that contributions can 
be better tracked and notifications correlated. It is also recommended that you 
use an email address that is valid permanently (for example the @apache.org 
address, if you have one). Please also see:
+
+* https://help.github.com/articles/setting-your-email-in-git/
+* 
https://help.github.com/articles/adding-an-email-address-to-your-github-account/
+* https://help.github.com/articles/keeping-your-email-address-private/
+
+The ASF Apex git repositories have mirror repositories on github which are 
used to review pull requests and provide a second remote endpoint for the 
codebase.
+
+1. Fork the ASF github mirror: https://github.com/apache/apex-core (or 
https://github.com/apache/apex-malhar or https://github.com/apache/apex-site) 
+1. Clone the **fork** on your local workspace (one time step):  
--- End diff --

I guess we can add a link to a git tutorial (https://try.github.io/) for 
those who have not used git until now.


> Improve setup instructions in contributor guidelines
> ----------------------------------------------------
>
> Key: APEXCORE-534
> URL: https://issues.apache.org/jira/browse/APEXCORE-534
> Project: Apache Apex Core
> Issue Type: Task
> Reporter: Thomas Weise
> Assignee: Thomas Weise
>
> Improve instructions for newcomers to get started contributing. 





[GitHub] apex-site pull request #50: APEXCORE-534 Improve contributor setup instructi...

2016-09-15 Thread chinmaykolhatkar
Github user chinmaykolhatkar commented on a diff in the pull request:

https://github.com/apache/apex-site/pull/50#discussion_r78984813
  
--- Diff: src/md/contributing.md ---
@@ -14,7 +14,38 @@ This project welcomes new contributors and invites 
everyone to participate. Our
 
 People that help with the project in any of the above categories or other 
ways are contributors. See the 
[roles](http://www.apache.org/foundation/how-it-works.html#roles) as defined by 
the ASF. Community members that make sustained, welcome contributions to the 
project may be invited to become a [committer](/people.html). 
 
-## Code Style
+## Before coding: One time Setup
+
+### JIRA
+
+Apache JIRA is used for issue tracking. If you do not already have an 
Apache JIRA account, sign up [here](https://issues.apache.org/jira/). Note that 
the user name should have no white spaces or other special characters that 
complicate auto-completion within JIRA comments etc. 
+
+Please use a single JIRA account only (don't create multiple with 
different email addresses) to retain the issue history. Please use a permanent 
email address, for an existing account it can be changed in the profile. If you 
absolutely have to change your user name, contact INFRA.
+
+Apex has 2 JIRA projects:
+
+1. [APEXCORE](https://issues.apache.org/jira/browse/APEXCORE/) for 
[apex-core](https://github.com/apache/apex-core) and 
[apex-site](https://github.com/apache/apex-site)
+2. [APEXMALHAR](https://issues.apache.org/jira/browse/APEXMALHAR/) for 
[apex-malhar](https://github.com/apache/apex-malhar)
+
+Before working on changes for any of the repositories, please locate an 
existing JIRA ticket or submit a new one. In order to assign an issue to 
yourself, you need to be listed as contributor in the JIRA project. PMC members 
have access to add new contributors, please request to be added through a 
comment on the candidate ticket.
--- End diff --

Rather than adding a comment on the ticket, should we ask new contributors to 
send a mail to dev@apex for contributor access? 
That way, the PMC/committers can follow up with the new contributor and help 
if required. The new contributor gets introduced to the community and, at 
the same time, existing community members get a chance to point them to the 
next steps.




[GitHub] apex-site pull request #50: APEXCORE-534 Improve contributor setup instructi...

2016-09-15 Thread chinmaykolhatkar
Github user chinmaykolhatkar commented on a diff in the pull request:

https://github.com/apache/apex-site/pull/50#discussion_r78987144
  
--- Diff: src/md/contributing.md ---
@@ -14,7 +14,38 @@ This project welcomes new contributors and invites 
everyone to participate. Our
 
 People that help with the project in any of the above categories or other 
ways are contributors. See the 
[roles](http://www.apache.org/foundation/how-it-works.html#roles) as defined by 
the ASF. Community members that make sustained, welcome contributions to the 
project may be invited to become a [committer](/people.html). 
 
-## Code Style
+## Before coding: One time Setup
+
+### JIRA
+
+Apache JIRA is used for issue tracking. If you do not already have an 
Apache JIRA account, sign up [here](https://issues.apache.org/jira/). Note that 
the user name should have no white spaces or other special characters that 
complicate auto-completion within JIRA comments etc. 
+
+Please use a single JIRA account only (don't create multiple with 
different email addresses) to retain the issue history. Please use a permanent 
email address, for an existing account it can be changed in the profile. If you 
absolutely have to change your user name, contact INFRA.
+
+Apex has 2 JIRA projects:
+
+1. [APEXCORE](https://issues.apache.org/jira/browse/APEXCORE/) for 
[apex-core](https://github.com/apache/apex-core) and 
[apex-site](https://github.com/apache/apex-site)
+2. [APEXMALHAR](https://issues.apache.org/jira/browse/APEXMALHAR/) for 
[apex-malhar](https://github.com/apache/apex-malhar)
+
+Before working on changes for any of the repositories, please locate an 
existing JIRA ticket or submit a new one. In order to assign an issue to 
yourself, you need to be listed as contributor in the JIRA project. PMC members 
have access to add new contributors, please request to be added through a 
comment on the candidate ticket.
+
+### Github and git
+
+We use GitHub’s pull request functionality to review proposed code 
changes. If you do not already have a personal GitHub account, sign up 
[here](https://github.com/join). We recommend that you use the same email 
address and first/lastname for emails, git and JIRA so that contributions can 
be better tracked and notifications correlated. It is also recommended that you 
use an email address that is valid permanently (for example the @apache.org 
address, if you have one). Please also see:
--- End diff --

In the sentence "for example ." Only committers will have an Apache email 
address. That's not encouraging IMO. Rather give "@gmail.com" as the example. 
A gmail.com address will most likely be a personal id and not a company 
address, which is an added advantage. :)




[jira] [Commented] (APEXCORE-534) Improve setup instructions in contributor guidelines

2016-09-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493596#comment-15493596
 ] 

ASF GitHub Bot commented on APEXCORE-534:
-

Github user chinmaykolhatkar commented on a diff in the pull request:

https://github.com/apache/apex-site/pull/50#discussion_r78987144
  
--- Diff: src/md/contributing.md ---
@@ -14,7 +14,38 @@ This project welcomes new contributors and invites 
everyone to participate. Our
 
 People that help with the project in any of the above categories or other 
ways are contributors. See the 
[roles](http://www.apache.org/foundation/how-it-works.html#roles) as defined by 
the ASF. Community members that make sustained, welcome contributions to the 
project may be invited to become a [committer](/people.html). 
 
-## Code Style
+## Before coding: One time Setup
+
+### JIRA
+
+Apache JIRA is used for issue tracking. If you do not already have an 
Apache JIRA account, sign up [here](https://issues.apache.org/jira/). Note that 
the user name should have no white spaces or other special characters that 
complicate auto-completion within JIRA comments etc. 
+
+Please use a single JIRA account only (don't create multiple with 
different email addresses) to retain the issue history. Please use a permanent 
email address, for an existing account it can be changed in the profile. If you 
absolutely have to change your user name, contact INFRA.
+
+Apex has 2 JIRA projects:
+
+1. [APEXCORE](https://issues.apache.org/jira/browse/APEXCORE/) for 
[apex-core](https://github.com/apache/apex-core) and 
[apex-site](https://github.com/apache/apex-site)
+2. [APEXMALHAR](https://issues.apache.org/jira/browse/APEXMALHAR/) for 
[apex-malhar](https://github.com/apache/apex-malhar)
+
+Before working on changes for any of the repositories, please locate an 
existing JIRA ticket or submit a new one. In order to assign an issue to 
yourself, you need to be listed as contributor in the JIRA project. PMC members 
have access to add new contributors, please request to be added through a 
comment on the candidate ticket.
+
+### Github and git
+
+We use GitHub’s pull request functionality to review proposed code 
changes. If you do not already have a personal GitHub account, sign up 
[here](https://github.com/join). We recommend that you use the same email 
address and first/lastname for emails, git and JIRA so that contributions can 
be better tracked and notifications correlated. It is also recommended that you 
use an email address that is valid permanently (for example the @apache.org 
address, if you have one). Please also see:
--- End diff --

In the sentence "for example ." Only committers will have an Apache email 
address. That's not encouraging IMO. Rather give "@gmail.com" as the example. 
A gmail.com address will most likely be a personal id and not a company 
address, which is an added advantage. :)


> Improve setup instructions in contributor guidelines
> ----------------------------------------------------
>
> Key: APEXCORE-534
> URL: https://issues.apache.org/jira/browse/APEXCORE-534
> Project: Apache Apex Core
> Issue Type: Task
> Reporter: Thomas Weise
> Assignee: Thomas Weise
>
> Improve instructions for newcomers to get started contributing. 





[GitHub] apex-malhar pull request #413: APEXMALHAR-2230 simplify the kafka input oper...

2016-09-15 Thread siyuanh
GitHub user siyuanh opened a pull request:

https://github.com/apache/apex-malhar/pull/413

APEXMALHAR-2230 simplify the kafka input operator test

OK, this change doesn't fix the intermittent error directly, but it 
simplifies the test to make it easier to debug in the future

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/siyuanh/apex-malhar APEXMALHAR-2230

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/413.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #413


commit 5909dfdc491fdca0cea7eca56fe72b8e1d32bcc3
Author: Siyuan Hua 
Date:   2016-09-15T15:31:34Z

APEXMALHAR-2230 simplify the kafka input operator test






[jira] [Commented] (APEXMALHAR-2230) Intermittent test failure in Kafka module

2016-09-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493687#comment-15493687
 ] 

ASF GitHub Bot commented on APEXMALHAR-2230:


GitHub user siyuanh opened a pull request:

https://github.com/apache/apex-malhar/pull/413

APEXMALHAR-2230 simplify the kafka input operator test

OK, this change doesn't fix the intermittent error directly, but it 
simplifies the test to make it easier to debug in the future

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/siyuanh/apex-malhar APEXMALHAR-2230

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/413.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #413


commit 5909dfdc491fdca0cea7eca56fe72b8e1d32bcc3
Author: Siyuan Hua 
Date:   2016-09-15T15:31:34Z

APEXMALHAR-2230 simplify the kafka input operator test




> Intermittent test failure in Kafka module
> -----------------------------------------
>
> Key: APEXMALHAR-2230
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2230
> Project: Apache Apex Malhar
> Issue Type: Bug
> Affects Versions: 3.5.0
> Reporter: Thomas Weise
> Assignee: Siyuan Hua
>
> Test fails intermittently in Travis CI. Could be a race condition in the test?





[jira] [Commented] (APEXCORE-534) Improve setup instructions in contributor guidelines

2016-09-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493564#comment-15493564
 ] 

ASF GitHub Bot commented on APEXCORE-534:
-

Github user chinmaykolhatkar commented on a diff in the pull request:

https://github.com/apache/apex-site/pull/50#discussion_r78984813
  
--- Diff: src/md/contributing.md ---
@@ -14,7 +14,38 @@ This project welcomes new contributors and invites 
everyone to participate. Our
 
 People that help with the project in any of the above categories or other 
ways are contributors. See the 
[roles](http://www.apache.org/foundation/how-it-works.html#roles) as defined by 
the ASF. Community members that make sustained, welcome contributions to the 
project may be invited to become a [committer](/people.html). 
 
-## Code Style
+## Before coding: One time Setup
+
+### JIRA
+
+Apache JIRA is used for issue tracking. If you do not already have an 
Apache JIRA account, sign up [here](https://issues.apache.org/jira/). Note that 
the user name should have no white spaces or other special characters that 
complicate auto-completion within JIRA comments etc. 
+
+Please use a single JIRA account only (don't create multiple with 
different email addresses) to retain the issue history. Please use a permanent 
email address, for an existing account it can be changed in the profile. If you 
absolutely have to change your user name, contact INFRA.
+
+Apex has 2 JIRA projects:
+
+1. [APEXCORE](https://issues.apache.org/jira/browse/APEXCORE/) for 
[apex-core](https://github.com/apache/apex-core) and 
[apex-site](https://github.com/apache/apex-site)
+2. [APEXMALHAR](https://issues.apache.org/jira/browse/APEXMALHAR/) for 
[apex-malhar](https://github.com/apache/apex-malhar)
+
+Before working on changes for any of the repositories, please locate an 
existing JIRA ticket or submit a new one. In order to assign an issue to 
yourself, you need to be listed as contributor in the JIRA project. PMC members 
have access to add new contributors, please request to be added through a 
comment on the candidate ticket.
--- End diff --

Rather than adding a comment on the ticket, should we ask new contributors to 
send a mail to dev@apex for contributor access? 
That way, the PMC/committers can follow up with the new contributor and help 
if required. The new contributor gets introduced to the community and, at 
the same time, existing community members get a chance to point them to the 
next steps.


> Improve setup instructions in contributor guidelines
> ----------------------------------------------------
>
> Key: APEXCORE-534
> URL: https://issues.apache.org/jira/browse/APEXCORE-534
> Project: Apache Apex Core
> Issue Type: Task
> Reporter: Thomas Weise
> Assignee: Thomas Weise
>
> Improve instructions for newcomers to get started contributing. 





[GitHub] apex-site pull request #50: APEXCORE-534 Improve contributor setup instructi...

2016-09-15 Thread tweise
Github user tweise commented on a diff in the pull request:

https://github.com/apache/apex-site/pull/50#discussion_r78986398
  
--- Diff: src/md/contributing.md ---
@@ -14,7 +14,38 @@ This project welcomes new contributors and invites 
everyone to participate. Our
 
 People that help with the project in any of the above categories or other 
ways are contributors. See the 
[roles](http://www.apache.org/foundation/how-it-works.html#roles) as defined by 
the ASF. Community members that make sustained, welcome contributions to the 
project may be invited to become a [committer](/people.html). 
 
-## Code Style
+## Before coding: One time Setup
+
+### JIRA
+
+Apache JIRA is used for issue tracking. If you do not already have an 
Apache JIRA account, sign up [here](https://issues.apache.org/jira/). Note that 
the user name should have no white spaces or other special characters that 
complicate auto-completion within JIRA comments etc. 
+
+Please use a single JIRA account only (don't create multiple with 
different email addresses) to retain the issue history. Please use a permanent 
email address, for an existing account it can be changed in the profile. If you 
absolutely have to change your user name, contact INFRA.
+
+Apex has 2 JIRA projects:
+
+1. [APEXCORE](https://issues.apache.org/jira/browse/APEXCORE/) for 
[apex-core](https://github.com/apache/apex-core) and 
[apex-site](https://github.com/apache/apex-site)
+2. [APEXMALHAR](https://issues.apache.org/jira/browse/APEXMALHAR/) for 
[apex-malhar](https://github.com/apache/apex-malhar)
+
+Before working on changes for any of the repositories, please locate an 
existing JIRA ticket or submit a new one. In order to assign an issue to 
yourself, you need to be listed as contributor in the JIRA project. PMC members 
have access to add new contributors, please request to be added through a 
comment on the candidate ticket.
--- End diff --

I considered that but if you are looking at a ticket, then it is easier to 
do it from there and it will result in an email to the dev list.




[GitHub] apex-core pull request #388: APEXCORE-527 - Minor changes in LocalStramChild...

2016-09-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/apex-core/pull/388




[jira] [Resolved] (APEXCORE-527) Minor changes in LocalStramChildLauncher to help with unit test failures

2016-09-15 Thread Thomas Weise (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXCORE-527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise resolved APEXCORE-527.
---
   Resolution: Fixed
Fix Version/s: 3.5.0

> Minor changes in LocalStramChildLauncher to help with unit test failures
> ------------------------------------------------------------------------
>
> Key: APEXCORE-527
> URL: https://issues.apache.org/jira/browse/APEXCORE-527
> Project: Apache Apex Core
> Issue Type: Improvement
> Reporter: Vlad Rozov
> Assignee: Vlad Rozov
> Priority: Minor
> Fix For: 3.5.0
>
>
> - Catch Error in addition to Exception, so both are properly logged
> - Update childContainers in the container thread





[jira] [Commented] (APEXCORE-535) Node.teardown() should try to gracefully shutdown executor service

2016-09-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494177#comment-15494177
 ] 

ASF GitHub Bot commented on APEXCORE-535:
-

Github user asfgit closed the pull request at:

https://github.com/apache/apex-core/pull/392


> Node.teardown() should try to gracefully shutdown executor service
> ------------------------------------------------------------------
>
> Key: APEXCORE-535
> URL: https://issues.apache.org/jira/browse/APEXCORE-535
> Project: Apache Apex Core
> Issue Type: Improvement
> Reporter: Vlad Rozov
> Assignee: Vlad Rozov
> Priority: Minor
> Fix For: 3.5.0
>
>
> Forceful shutdown of the executor service leads to InterruptedException if 
> asynchronous checkpointing is in progress:
> {noformat}
> java.lang.InterruptedException
> at java.lang.Object.wait(Native Method)
> at java.lang.Thread.join(Thread.java:1281)
> at java.lang.Thread.join(Thread.java:1355)
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:456)
> at org.apache.hadoop.util.Shell.run(Shell.java:379)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
> at org.apache.hadoop.util.Shell.execCommand(Shell.java:678)
> at org.apache.hadoop.util.Shell.execCommand(Shell.java:661)
> at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:639)
> at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:305)
> at org.apache.hadoop.fs.FileSystem.primitiveCreate(FileSystem.java:1011)
> at org.apache.hadoop.fs.DelegateToFileSystem.createInternal(DelegateToFileSystem.java:85)
> at org.apache.hadoop.fs.ChecksumFs$ChecksumFSOutputSummer.<init>(ChecksumFs.java:344)
> at org.apache.hadoop.fs.ChecksumFs.createInternal(ChecksumFs.java:390)
> at org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:575)
> at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:676)
> at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:672)
> at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
> at org.apache.hadoop.fs.FileContext.create(FileContext.java:672)
> at com.datatorrent.common.util.AsyncFSStorageAgent.copyToHDFS(AsyncFSStorageAgent.java:118)
> at com.datatorrent.stram.engine.Node$CheckpointHandler.call(Node.java:667)
> at com.datatorrent.stram.engine.Node$CheckpointHandler.call(Node.java:656)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 2016-09-14 22:02:51,463 [Thread-2109] WARN  util.Shell run - Error reading the error stream
> java.io.IOException: Stream closed
> at java.io.BufferedReader.ensureOpen(BufferedReader.java:115)
> at java.io.BufferedReader.readLine(BufferedReader.java:310)
> at java.io.BufferedReader.readLine(BufferedReader.java:382)
> at org.apache.hadoop.util.Shell$1.run(Shell.java:431)
> {noformat}
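
A graceful shutdown of the kind the issue title asks for is commonly written with the standard `java.util.concurrent` calls shown below. This is a minimal sketch and not the change made in the pull request: the class name `GracefulShutdownSketch`, the method `shutdownGracefully`, and the timeout parameters are assumptions for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

public class GracefulShutdownSketch {
  /**
   * Stops accepting new tasks, waits up to the timeout for submitted tasks
   * to finish, and only then falls back to interrupting them.
   * Returns true if the executor terminated without a forced shutdown.
   */
  static boolean shutdownGracefully(ExecutorService executor, long timeout, TimeUnit unit) {
    executor.shutdown(); // no interrupts; already submitted tasks keep running
    try {
      if (executor.awaitTermination(timeout, unit)) {
        return true;
      }
      executor.shutdownNow(); // timeout expired: interrupt whatever is still running
      return false;
    } catch (InterruptedException e) {
      executor.shutdownNow();
      Thread.currentThread().interrupt(); // preserve the caller's interrupt status
      return false;
    }
  }
}
```

With this ordering, a task such as an in-progress asynchronous checkpoint is interrupted only if it outlives the timeout, which is the situation the stack trace above shows happening immediately under a forceful shutdown.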





[GitHub] apex-core pull request #389: APEXCORE-531 - Enable System.out/System.err che...

2016-09-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/apex-core/pull/389




[jira] [Resolved] (APEXCORE-531) Enable System.out/System.err check for *Test

2016-09-15 Thread Thomas Weise (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXCORE-531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise resolved APEXCORE-531.
---
   Resolution: Fixed
Fix Version/s: 3.5.0

> Enable System.out/System.err check for *Test
> 
>
> Key: APEXCORE-531
> URL: https://issues.apache.org/jira/browse/APEXCORE-531
> Project: Apache Apex Core
>  Issue Type: Task
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
> Fix For: 3.5.0
>
>






[jira] [Commented] (APEXMALHAR-2240) Implement Windowed Join Operator

2016-09-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494803#comment-15494803
 ] 

ASF GitHub Bot commented on APEXMALHAR-2240:


GitHub user ShunxinLu opened a pull request:

https://github.com/apache/apex-malhar/pull/414

REVIEW ONLY: APEXMALHAR-2240 Implement Windowed Join Operator

@davidyan74 @tweise @siyuanh This is a first attempt at implementing join 
support for windowed streams. This PR is for review purposes only; please ignore 
all checkstyle errors. 
The plan is for `WindowedJoinOperator` to support joining up to 5 streams, 
but for now it joins only 2 streams to verify the implementation. 
`KeyedWindowedJoinOperatorImpl` currently contains a lot of code duplicated from 
`KeyWindowedOperatorImpl`; any suggestions on how to eliminate the 
duplication would be appreciated. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ShunxinLu/apex-malhar new_join

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/414.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #414


commit 88b3625f8188caf944d66e430d8278951c078ebc
Author: Shunxin 
Date:   2016-09-15T22:48:42Z

APEXMALHAR-2240 Implement Windowed Join Operator




> Implement Windowed Join Operator
> 
>
> Key: APEXMALHAR-2240
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2240
> Project: Apache Apex Malhar
>  Issue Type: Task
>Reporter: Shunxin Lu
>Assignee: Shunxin Lu
>
> Implement windowed join operator that supports join aggregation. 





[jira] [Created] (APEXMALHAR-2239) Null pointer exception in JDBCInputOperator when the field names do not match in Pojo, table

2016-09-15 Thread Venkatesh Kottapalli (JIRA)
Venkatesh Kottapalli created APEXMALHAR-2239:


 Summary: Null pointer exception in JDBCInputOperator when the 
field names do not match in Pojo, table
 Key: APEXMALHAR-2239
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2239
 Project: Apache Apex Malhar
  Issue Type: Bug
Affects Versions: 3.5.0
Reporter: Venkatesh Kottapalli
Priority: Minor


The following exception is thrown when the field names for the table provided 
in the FieldInfo mapping don't match with the column names of the table.

* Exception: *
Abandoning deployment due to setup failure. java.lang.NullPointerException
 at com.datatorrent.lib.db.jdbc.JdbcPOJOInputOperator.activate(JdbcPOJOInputOperator.java:366)
 at com.datatorrent.lib.db.jdbc.JdbcPOJOInputOperator.activate(JdbcPOJOInputOperator.java:67)
 at com.datatorrent.stram.engine.Node.activate(Node.java:619)
 at com.datatorrent.stram.engine.StreamingContainer.setupNode(StreamingContainer.java:1336)
 at com.datatorrent.stram.engine.StreamingContainer.access$100(StreamingContainer.java:130)
 at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1388)








[jira] [Updated] (APEXMALHAR-2239) Null pointer exception in JDBCInputOperator when the field names do not match in Pojo, table

2016-09-15 Thread Venkatesh Kottapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesh Kottapalli updated APEXMALHAR-2239:
-
Description: 
The following exception is thrown when the field names for the table provided 
in the FieldInfo mapping don't match with the column names of the table.

Exception: 
Abandoning deployment due to setup failure. java.lang.NullPointerException
 at com.datatorrent.lib.db.jdbc.JdbcPOJOInputOperator.activate(JdbcPOJOInputOperator.java:366)
 at com.datatorrent.lib.db.jdbc.JdbcPOJOInputOperator.activate(JdbcPOJOInputOperator.java:67)
 at com.datatorrent.stram.engine.Node.activate(Node.java:619)
 at com.datatorrent.stram.engine.StreamingContainer.setupNode(StreamingContainer.java:1336)
 at com.datatorrent.stram.engine.StreamingContainer.access$100(StreamingContainer.java:130)
 at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1388)




  was:
The following exception is thrown when the field names for the table provided 
in the FieldInfo mapping don't match with the column names of the table.

* Exception: *
Abandoning deployment due to setup failure. java.lang.NullPointerException
 at com.datatorrent.lib.db.jdbc.JdbcPOJOInputOperator.activate(JdbcPOJOInputOperator.java:366)
 at com.datatorrent.lib.db.jdbc.JdbcPOJOInputOperator.activate(JdbcPOJOInputOperator.java:67)
 at com.datatorrent.stram.engine.Node.activate(Node.java:619)
 at com.datatorrent.stram.engine.StreamingContainer.setupNode(StreamingContainer.java:1336)
 at com.datatorrent.stram.engine.StreamingContainer.access$100(StreamingContainer.java:130)
 at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1388)





> Null pointer exception in JDBCInputOperator when the field names do not match 
> in Pojo, table
> 
>
> Key: APEXMALHAR-2239
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2239
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Venkatesh Kottapalli
>Priority: Minor
>
> The following exception is thrown when the field names for the table provided 
> in the FieldInfo mapping don't match with the column names of the table.
> Exception: 
> Abandoning deployment due to setup failure. java.lang.NullPointerException
>  at com.datatorrent.lib.db.jdbc.JdbcPOJOInputOperator.activate(JdbcPOJOInputOperator.java:366)
>  at com.datatorrent.lib.db.jdbc.JdbcPOJOInputOperator.activate(JdbcPOJOInputOperator.java:67)
>  at com.datatorrent.stram.engine.Node.activate(Node.java:619)
>  at com.datatorrent.stram.engine.StreamingContainer.setupNode(StreamingContainer.java:1336)
>  at com.datatorrent.stram.engine.StreamingContainer.access$100(StreamingContainer.java:130)
>  at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1388)





[jira] [Updated] (APEXMALHAR-2239) Null pointer exception in JDBCPojoInputOperator when the field names do not match in Pojo, table

2016-09-15 Thread Venkatesh Kottapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesh Kottapalli updated APEXMALHAR-2239:
-
Summary: Null pointer exception in JDBCPojoInputOperator when the field 
names do not match in Pojo, table  (was: Null pointer exception in 
JDBCInputOperator when the field names do not match in Pojo, table)

> Null pointer exception in JDBCPojoInputOperator when the field names do not 
> match in Pojo, table
> 
>
> Key: APEXMALHAR-2239
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2239
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Venkatesh Kottapalli
>Priority: Minor
>
> The following exception is thrown when the field names for the table provided 
> in the FieldInfo mapping don't match with the column names of the table.
> Exception: 
> Abandoning deployment due to setup failure. java.lang.NullPointerException
>  at com.datatorrent.lib.db.jdbc.JdbcPOJOInputOperator.activate(JdbcPOJOInputOperator.java:366)
>  at com.datatorrent.lib.db.jdbc.JdbcPOJOInputOperator.activate(JdbcPOJOInputOperator.java:67)
>  at com.datatorrent.stram.engine.Node.activate(Node.java:619)
>  at com.datatorrent.stram.engine.StreamingContainer.setupNode(StreamingContainer.java:1336)
>  at com.datatorrent.stram.engine.StreamingContainer.access$100(StreamingContainer.java:130)
>  at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1388)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (APEXMALHAR-2239) Null pointer exception in JDBCPojoInputOperator when the field names do not match in Pojo, table

2016-09-15 Thread Venkatesh Kottapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkatesh Kottapalli updated APEXMALHAR-2239:
-
Description: 
The following exception is thrown in JDBCPojoInputOperator when the field names 
for the table provided in the FieldInfo mapping don't match with the column 
names of the table.

Exception: 
Abandoning deployment due to setup failure. java.lang.NullPointerException
 at com.datatorrent.lib.db.jdbc.JdbcPOJOInputOperator.activate(JdbcPOJOInputOperator.java:366)
 at com.datatorrent.lib.db.jdbc.JdbcPOJOInputOperator.activate(JdbcPOJOInputOperator.java:67)
 at com.datatorrent.stram.engine.Node.activate(Node.java:619)
 at com.datatorrent.stram.engine.StreamingContainer.setupNode(StreamingContainer.java:1336)
 at com.datatorrent.stram.engine.StreamingContainer.access$100(StreamingContainer.java:130)
 at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1388)




  was:
The following exception is thrown when the field names for the table provided 
in the FieldInfo mapping don't match with the column names of the table.

Exception: 
Abandoning deployment due to setup failure. java.lang.NullPointerException
 at com.datatorrent.lib.db.jdbc.JdbcPOJOInputOperator.activate(JdbcPOJOInputOperator.java:366)
 at com.datatorrent.lib.db.jdbc.JdbcPOJOInputOperator.activate(JdbcPOJOInputOperator.java:67)
 at com.datatorrent.stram.engine.Node.activate(Node.java:619)
 at com.datatorrent.stram.engine.StreamingContainer.setupNode(StreamingContainer.java:1336)
 at com.datatorrent.stram.engine.StreamingContainer.access$100(StreamingContainer.java:130)
 at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1388)





> Null pointer exception in JDBCPojoInputOperator when the field names do not 
> match in Pojo, table
> 
>
> Key: APEXMALHAR-2239
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2239
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Venkatesh Kottapalli
>Priority: Minor
>
> The following exception is thrown in JDBCPojoInputOperator when the field 
> names for the table provided in the FieldInfo mapping don't match with the 
> column names of the table.
> Exception: 
> Abandoning deployment due to setup failure. java.lang.NullPointerException
>  at com.datatorrent.lib.db.jdbc.JdbcPOJOInputOperator.activate(JdbcPOJOInputOperator.java:366)
>  at com.datatorrent.lib.db.jdbc.JdbcPOJOInputOperator.activate(JdbcPOJOInputOperator.java:67)
>  at com.datatorrent.stram.engine.Node.activate(Node.java:619)
>  at com.datatorrent.stram.engine.StreamingContainer.setupNode(StreamingContainer.java:1336)
>  at com.datatorrent.stram.engine.StreamingContainer.access$100(StreamingContainer.java:130)
>  at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1388)
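
One way to turn a failure like this into a descriptive error would be to compare the mapped column names with the table's columns before activation. The sketch below is only an assumption about how such a check could look. It is not taken from JdbcPOJOInputOperator, and the class name, method names, and the case-insensitive comparison are hypothetical choices.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class FieldMappingCheckSketch {
  /** Returns the mapped column names that are absent from the table (case-insensitive, an assumption). */
  static List<String> missingColumns(Collection<String> tableColumns, Collection<String> mappedColumns) {
    Set<String> known = new HashSet<>();
    for (String column : tableColumns) {
      known.add(column.toLowerCase(Locale.ROOT));
    }
    List<String> missing = new ArrayList<>();
    for (String column : mappedColumns) {
      if (!known.contains(column.toLowerCase(Locale.ROOT))) {
        missing.add(column);
      }
    }
    return missing;
  }

  /** Fails fast with a message naming the unmatched columns instead of a later NullPointerException. */
  static void validate(Collection<String> tableColumns, Collection<String> mappedColumns) {
    List<String> missing = missingColumns(tableColumns, mappedColumns);
    if (!missing.isEmpty()) {
      throw new IllegalArgumentException("Mapped columns not found in table: " + missing);
    }
  }
}
```

For example, `validate(Arrays.asList("ID", "NAME"), Arrays.asList("id", "age"))` would throw an exception whose message names `age`.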





[jira] [Commented] (APEXMALHAR-2223) Managed state should parallelize WAL writes

2016-09-15 Thread Thomas Weise (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494377#comment-15494377
 ] 

Thomas Weise commented on APEXMALHAR-2223:
--

Why delay writing to the WAL until endWindow? You get the best throughput by 
writing the data immediately (asynchronously), unless the same key is updated 
multiple times in a window.

The other question is when the WAL should be flushed (blocking). Does the flush 
need to happen in endWindow for idempotency, or can it wait until beforeCheckpoint()?


> Managed state should parallelize WAL writes
> ---
>
> Key: APEXMALHAR-2223
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2223
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Affects Versions: 3.4.0
>Reporter: Thomas Weise
>Assignee: Chandni Singh
>
> Currently, data is accumulated in memory and written to the WAL on checkpoint 
> only. This causes a write spike on checkpoint and does not utilize the HDFS 
> write pipeline. The other extreme is writing to the WAL as soon as data 
> arrives and flushing only in beforeCheckpoint. The downside of this is that 
> when the same key is written many times, all duplicates will be in the WAL. 
> We need to find a balanced approach that the user can potentially fine-tune. 
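
One possible middle ground between the two extremes described above is a bounded buffer that coalesces repeated keys and spills to the WAL once a user-tunable threshold is reached. The sketch below is purely illustrative and is not the managed state implementation: the `Wal` interface, the class `CoalescingWalBufferSketch`, and the `maxBufferedKeys` parameter are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CoalescingWalBufferSketch {
  /** Hypothetical WAL abstraction: append may be asynchronous, flush blocks until durable. */
  interface Wal {
    void append(String key, String value);

    void flush();
  }

  private final Wal wal;
  private final int maxBufferedKeys; // user-tunable threshold (assumption)
  private final Map<String, String> buffer = new LinkedHashMap<>();

  CoalescingWalBufferSketch(Wal wal, int maxBufferedKeys) {
    this.wal = wal;
    this.maxBufferedKeys = maxBufferedKeys;
  }

  /**
   * Buffers the write. A later write to a key that is still buffered replaces
   * the earlier value, so only the latest value of that key is appended when
   * the buffer is drained.
   */
  void put(String key, String value) {
    buffer.put(key, value);
    if (buffer.size() >= maxBufferedKeys) {
      drain();
    }
  }

  /** Intended to be called before a checkpoint: write what is left, then block on flush. */
  void beforeCheckpoint() {
    drain();
    wal.flush();
  }

  private void drain() {
    for (Map.Entry<String, String> entry : buffer.entrySet()) {
      wal.append(entry.getKey(), entry.getValue());
    }
    buffer.clear();
  }
}
```

A threshold of 1 behaves like writing every update as it arrives, while a very large threshold approaches the current write-on-checkpoint behavior, so the single parameter spans both extremes.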





Python support

2016-09-15 Thread Thomas Weise
Hi,

Python (not Jython) seems to be a popular language and is frequently used for
data analysis, especially where flexibility matters. It has a comprehensive
library and is generally considered to have a low barrier to entry. I have also
seen Python used in critical back-end components, although that's probably
not very common?

I think Python support could potentially expand the user base for Apex.
There are 2 main areas that can be considered:

1) Support to execute Python code through an operator
2) A client API that lets users construct pipelines in Python

The former can exist without the latter, and it would enable users to
leverage existing code that would otherwise have to be rewritten in a JVM
language. The engine could ship scripts/packages so they are automatically
distributed on the cluster.

A useful client API probably requires back-end support for lambda functions
and more complex UDFs.

It would be great to get some feedback, especially from those who have
experience with Python, on how an integration could potentially open up new
use cases for Apex.

Thanks,
Thomas