Re: Proposal: CompositeAccumulation for Windowed Operator

2017-02-27 Thread Chinmay Kolhatkar
Thanks Bright. I've reviewed your PR. It looks good. Just a minor change is required. Please see my comment there. On Mon, Feb 27, 2017 at 11:26 PM, Bright Chen wrote: > A jira created: https://issues.apache.org/jira/browse/APEXMALHAR-2428 > > > On Mon, Feb 27, 2017 at

Re: Maven build package error with 3.5

2017-02-27 Thread Sanjay Pujare
The root cause seems to be embedded in your output: Could not transfer artifact org.apache.apex:malhar-library:pom:3.5.0 from/to central ( https://repo.maven.apache.org/maven2): sun.security.validator.ValidatorException: PKIX path building failed:

Maven build package error with 3.5

2017-02-27 Thread Dongming Liang
It was running well with 3.4, but now failing with Apex 3.5 ➜ log-aggregator git:(apex-tcp) ✗ mvn package -DskipTests [INFO] Scanning for projects... [INFO] [INFO] [INFO] Building Aggregator 1.0-SNAPSHOT [INFO]

[jira] [Commented] (APEXMALHAR-2366) Apply BloomFilter to Bucket

2017-02-27 Thread bright chen (JIRA)
[ https://issues.apache.org/jira/browse/APEXMALHAR-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886659#comment-15886659 ] bright chen commented on APEXMALHAR-2366: - Hi [~bhupesh] The only difference as I think is

Re: Java packages: legacy -> org.apache.apex

2017-02-27 Thread Vlad Rozov
Let's not confuse open source with the ASF; they are not the same. Not all open source projects are part of Apache. The majority of Apache projects do use "org.apache." package names. Thank you, Vlad On 2/27/17 10:24, Sanjay Pujare wrote: +1 for bullet 1 assuming new code implies brand new classes

Re: Java packages: legacy -> org.apache.apex

2017-02-27 Thread Pramod Immaneni
For malhar, for existing operators, I prefer we do this as part of the planned refactoring for breaking the monolith modules into baby packages and would also prefer deprecating the existing operators in place. This will help us achieve two things. First, the user will see all the new changes at

Re: [DISCUSS] Proposal for adapting Malhar operators for batch use cases

2017-02-27 Thread David Yan
I now see your rationale on putting the filename in the window. As far as I understand, the reasons why the filename is not part of the key and the Global Window is not used are: 1) The files are processed in sequence, not in parallel 2) The windowed operator should not keep the state associated

[GitHub] apex-malhar pull request #566: APEXMALHAR-2428 CompositeAccumulation for win...

2017-02-27 Thread brightchen
GitHub user brightchen opened a pull request: https://github.com/apache/apex-malhar/pull/566 APEXMALHAR-2428 CompositeAccumulation for windowed operator You can merge this pull request into a Git repository by running: $ git pull https://github.com/brightchen/apex-malhar

[jira] [Commented] (APEXMALHAR-2428) CompositeAccumulation for windowed operator

2017-02-27 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/APEXMALHAR-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886304#comment-15886304 ] ASF GitHub Bot commented on APEXMALHAR-2428: GitHub user brightchen opened a pull

[jira] [Resolved] (APEXMALHAR-2395) create MultiAccumulation

2017-02-27 Thread bright chen (JIRA)
[ https://issues.apache.org/jira/browse/APEXMALHAR-2395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bright chen resolved APEXMALHAR-2395. - Resolution: Duplicate duplicate with

Re: Java packages: legacy -> org.apache.apex

2017-02-27 Thread Sanjay Pujare
+1 for bullet 1 assuming new code implies brand new classes (since it doesn't involve any backward compatibility issues). We can always review contributor PRs to make sure new code is added with new package naming guidelines. But for 2 and 3 I have a question/comment: is there even a need to do

Re: Proposal: CompositeAccumulation for Windowed Operator

2017-02-27 Thread Bright Chen
A jira created: https://issues.apache.org/jira/browse/APEXMALHAR-2428 On Mon, Feb 27, 2017 at 9:53 AM, Bright Chen wrote: > I think Chinmay's proposal could make the application clearer and improve > performance, since locating the key/window accounts for most of the time. > > A

[jira] [Created] (APEXMALHAR-2428) CompositeAccumulation for windowed operator

2017-02-27 Thread bright chen (JIRA)
bright chen created APEXMALHAR-2428: --- Summary: CompositeAccumulation for windowed operator Key: APEXMALHAR-2428 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2428 Project: Apache Apex Malhar

Re: Proposal: CompositeAccumulation for Windowed Operator

2017-02-27 Thread Bright Chen
I think Chinmay's proposal could make the application clearer and improve performance, since locating the key/window accounts for most of the time. A suggested usage for CompositeAccumulation could be as follows: *// the following sample code shows how to add sub-accumulations* *CompositeAccumulation
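The sample code in the message above is cut off in this listing, so what follows is only a minimal sketch of the general idea, not the code from the proposal or the PR. It assumes a simplified, hypothetical accumulation interface over long values (SimpleAccumulation, Composite, and every method name here are invented for illustration and are not Malhar's API). It shows the point made in the thread: once the state for a key/window has been located, all sub-accumulations are updated in one pass.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.LongBinaryOperator;

class CompositeAccumulationSketch {

    // Hypothetical, simplified accumulation over long values; NOT the Malhar interface.
    interface SimpleAccumulation {
        long defaultValue();
        long accumulate(long accumulated, long input);
    }

    // Hypothetical composite: holds sub-accumulations and updates all of them per input.
    static final class Composite {
        private final List<SimpleAccumulation> subs = new ArrayList<>();

        // Registers a sub-accumulation and returns its index into the state array.
        int add(SimpleAccumulation sub) {
            subs.add(sub);
            return subs.size() - 1;
        }

        // Builds the initial state, one slot per sub-accumulation.
        long[] defaultState() {
            long[] state = new long[subs.size()];
            for (int i = 0; i < state.length; i++) {
                state[i] = subs.get(i).defaultValue();
            }
            return state;
        }

        // Applies one input to every sub-accumulation against the same located state.
        void accumulate(long[] state, long input) {
            for (int i = 0; i < state.length; i++) {
                state[i] = subs.get(i).accumulate(state[i], input);
            }
        }
    }

    // Helper that builds a sub-accumulation from a default value and a combine step.
    static SimpleAccumulation of(long defaultValue, LongBinaryOperator op) {
        return new SimpleAccumulation() {
            @Override public long defaultValue() { return defaultValue; }
            @Override public long accumulate(long accumulated, long input) {
                return op.applyAsLong(accumulated, input);
            }
        };
    }

    // Runs sum, max, and count together over the inputs; returns {sum, max, count}.
    static long[] run(long... inputs) {
        Composite composite = new Composite();
        composite.add(of(0L, Long::sum));                      // index 0: sum
        composite.add(of(Long.MIN_VALUE, Math::max));          // index 1: max
        composite.add(of(0L, (acc, ignored) -> acc + 1));      // index 2: count
        long[] state = composite.defaultState();
        for (long input : inputs) {
            composite.accumulate(state, input);
        }
        return state;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run(3, 7, 5))); // prints [15, 7, 3]
    }
}
```

In this sketch the index returned by add identifies each sub-accumulation's slot in the shared state, so a caller can read individual results without a second key/window lookup. Whether the actual proposal exposes results this way is not visible in the truncated message.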

Re: [DISCUSS] Proposal for adapting Malhar operators for batch use cases

2017-02-27 Thread Thomas Weise
On Mon, Feb 27, 2017 at 8:50 AM, Bhupesh Chawda wrote: > I think my comments related to count based windows might be causing > confusion. Let's not discuss count based scenarios for now. > > Just want to make sure we are on the same page wrt. the "each file is a > batch"

Re: [DISCUSS] Proposal for adapting Malhar operators for batch use cases

2017-02-27 Thread Bhupesh Chawda
I think my comments related to count based windows might be causing confusion. Let's not discuss count based scenarios for now. Just want to make sure we are on the same page wrt. the "each file is a batch" use case. As mentioned by Thomas, each tuple from the same file has the same timestamp

Re: Java packages: legacy -> org.apache.apex

2017-02-27 Thread Chinmay Kolhatkar
Thomas, I agree with you that we need this migration to be done but I have a different opinion on how to execute this. I think if we do this in phases as described above, users might end up in more confusion. For doing this migration, I think it should follow these steps: 1. Whether for operator

Re: [DISCUSS] Proposal for adapting Malhar operators for batch use cases

2017-02-27 Thread Thomas Weise
I don't think this is a use case for count based window. We have multiple files that are retrieved in a sequence and there is no knowledge of the number of records per file. The requirement is to aggregate each file separately and emit the aggregate when the file is read fully. There is no

Re: [DISCUSS] Proposal for adapting Malhar operators for batch use cases

2017-02-27 Thread David Yan
I don't think this is the way to go. Global Window only means the timestamp does not matter (or that there is no timestamp). It does not necessarily mean it's a large batch. Unless there is some notion of event time for each file, you don't want to embed the file into the window itself. If you

Java packages: legacy -> org.apache.apex

2017-02-27 Thread Thomas Weise
Hi, This topic has come up on several PRs and I think it warrants a broader discussion. At the time of incubation, the decision was to defer change of Java packages from com.datatorrent to org.apache.apex till next major release to ensure backward compatibility for users. Unfortunately that has

[jira] [Resolved] (APEXMALHAR-2424) Null pointer exception in JDBCPojoPollInputOperator is thrown when we set columnsExpression and have additional columns in fieldInfos

2017-02-27 Thread shubham pathak (JIRA)
[ https://issues.apache.org/jira/browse/APEXMALHAR-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shubham pathak resolved APEXMALHAR-2424. Resolution: Fixed > Null pointer exception in JDBCPojoPollInputOperator is

[jira] [Updated] (APEXMALHAR-2424) Null pointer exception in JDBCPojoPollInputOperator is thrown when we set columnsExpression and have additional columns in fieldInfos

2017-02-27 Thread shubham pathak (JIRA)
[ https://issues.apache.org/jira/browse/APEXMALHAR-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shubham pathak updated APEXMALHAR-2424: --- Fix Version/s: 3.7.0 > Null pointer exception in JDBCPojoPollInputOperator is

[jira] [Commented] (APEXMALHAR-2424) Null pointer exception in JDBCPojoPollInputOperator is thrown when we set columnsExpression and have additional columns in fieldInfos

2017-02-27 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/APEXMALHAR-2424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15885722#comment-15885722 ] ASF GitHub Bot commented on APEXMALHAR-2424: Github user asfgit closed the pull request

[GitHub] apex-malhar pull request #562: APEXMALHAR-2424 Extra null field getting adde...

2017-02-27 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/apex-malhar/pull/562 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature

[jira] [Resolved] (APEXMALHAR-2415) Enable PojoInnerJoin accum to allow multiple keys for join purpose

2017-02-27 Thread Tushar Gosavi (JIRA)
[ https://issues.apache.org/jira/browse/APEXMALHAR-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tushar Gosavi resolved APEXMALHAR-2415. --- Resolution: Fixed Fix Version/s: 3.7.0 > Enable PojoInnerJoin accum to

[jira] [Commented] (APEXMALHAR-2415) Enable PojoInnerJoin accum to allow multiple keys for join purpose

2017-02-27 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/APEXMALHAR-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15885663#comment-15885663 ] ASF GitHub Bot commented on APEXMALHAR-2415: Github user asfgit closed the pull request

[GitHub] apex-malhar pull request #561: APEXMALHAR-2415 Taking join on multiple colum...

2017-02-27 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/apex-malhar/pull/561

Re: [DISCUSS] Proposal for adapting Malhar operators for batch use cases

2017-02-27 Thread Bhupesh Chawda
Hi David, Thanks for your comments. The wordcount example that I created based on the windowed operator processes word counts per file (each file as a separate batch), i.e. it computes counts for each file and dumps them into separate files. As I understand it, the Global Window is for one large batch;

[jira] [Resolved] (APEXMALHAR-2414) Improve performance of PojoInnerJoin accum by using PojoUtils

2017-02-27 Thread Tushar Gosavi (JIRA)
[ https://issues.apache.org/jira/browse/APEXMALHAR-2414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tushar Gosavi resolved APEXMALHAR-2414. --- Resolution: Fixed Fix Version/s: 3.7.0 Changes pushed through

[jira] [Commented] (APEXMALHAR-2426) Add user document for RegexParser operator

2017-02-27 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/APEXMALHAR-2426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15885466#comment-15885466 ] ASF GitHub Bot commented on APEXMALHAR-2426: GitHub user venkateshkottapalli opened a

[GitHub] apex-malhar pull request #565: APEXMALHAR-2426 - RegexParser Documentation

2017-02-27 Thread venkateshkottapalli
GitHub user venkateshkottapalli opened a pull request: https://github.com/apache/apex-malhar/pull/565 APEXMALHAR-2426 - RegexParser Documentation APEXMALHAR-2426 : Regex Parser Documentation You can merge this pull request into a Git repository by running: $ git pull

[jira] [Resolved] (APEXMALHAR-2218) RegexParser- Operator to parse byte stream using Regex pattern and emit a POJO

2017-02-27 Thread Venkatesh Kottapalli (JIRA)
[ https://issues.apache.org/jira/browse/APEXMALHAR-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venkatesh Kottapalli resolved APEXMALHAR-2218. -- Resolution: Fixed Merged. > RegexParser- Operator to parse byte

[jira] [Updated] (APEXMALHAR-2218) RegexParser- Operator to parse byte stream using Regex pattern and emit a POJO

2017-02-27 Thread Venkatesh Kottapalli (JIRA)
[ https://issues.apache.org/jira/browse/APEXMALHAR-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venkatesh Kottapalli updated APEXMALHAR-2218: - Fix Version/s: 3.7.0 Issue Type: New Feature (was:

[jira] [Commented] (APEXMALHAR-2218) RegexParser- Operator to parse byte stream using Regex pattern and emit a POJO

2017-02-27 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/APEXMALHAR-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15885445#comment-15885445 ] ASF GitHub Bot commented on APEXMALHAR-2218: Github user asfgit closed the pull request

[GitHub] apex-malhar pull request #396: APEXMALHAR-2218-Creation of RegexSplitter ope...

2017-02-27 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/apex-malhar/pull/396

[jira] [Commented] (APEXMALHAR-2427) Kinesis Input Operator documentation

2017-02-27 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/APEXMALHAR-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15885440#comment-15885440 ] ASF GitHub Bot commented on APEXMALHAR-2427: GitHub user deepak-narkhede opened a pull