[jira] [Commented] (APEXMALHAR-2119) Make DirectoryScanner in AbstractFileInputOperator inheritance friendly.

2016-06-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15331576#comment-15331576
 ] 

ASF GitHub Bot commented on APEXMALHAR-2119:


GitHub user tushargosavi opened a pull request:

https://github.com/apache/apex-malhar/pull/318

APEXMALHAR-2119 add setters for partition count and index.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tushargosavi/incubator-apex-malhar 
APEXMALHAR-2119

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/318.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #318


commit 82cfbbbcca80c4f6eb4d76cb35a6fb5402db0d58
Author: Tushar R. Gosavi 
Date:   2016-06-15T11:42:16Z

APEXMALHAR-2119 add setters for partition count and index.




> Make DirectoryScanner in AbstractFileInputOperator inheritance friendly. 
> -
>
> Key: APEXMALHAR-2119
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2119
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Tushar Gosavi
>Assignee: Tushar Gosavi
>
> The DirectoryScanner has partitionIndex and partitionCount  declared as 
> private without any setters. Inherited DirectoryScanner can not set them and 
> hence can not call most of the methods in DirectoryScanner which depends on 
> these fields (acceptFile). 
> Also new DirectoryScanner has to implement createPartition as default one 
> creates instance of DirectoryScanner by default.
> Make the class inheritance friendly by adding setters and use kryo clone in 
> createPartition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] apex-malhar pull request #318: APEXMALHAR-2119 add setters for partition cou...

2016-06-15 Thread tushargosavi
GitHub user tushargosavi opened a pull request:

https://github.com/apache/apex-malhar/pull/318

APEXMALHAR-2119 add setters for partition count and index.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tushargosavi/incubator-apex-malhar 
APEXMALHAR-2119

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/318.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #318


commit 82cfbbbcca80c4f6eb4d76cb35a6fb5402db0d58
Author: Tushar R. Gosavi 
Date:   2016-06-15T11:42:16Z

APEXMALHAR-2119 add setters for partition count and index.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (APEXMALHAR-2119) Make DirectoryScanner in AbstractFileInputOperator inheritance friendly.

2016-06-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15331848#comment-15331848
 ] 

ASF GitHub Bot commented on APEXMALHAR-2119:


Github user amberarrow commented on a diff in the pull request:

https://github.com/apache/apex-malhar/pull/318#discussion_r67174126
  
--- Diff: 
library/src/main/java/com/datatorrent/lib/io/fs/AbstractFileInputOperator.java 
---
@@ -1056,10 +1056,15 @@ protected Pattern getRegex()
   return pathSet;
 }
 
+protected int getParition(String filePathStr)
--- End diff --

getParition => getPartition


> Make DirectoryScanner in AbstractFileInputOperator inheritance friendly. 
> -
>
> Key: APEXMALHAR-2119
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2119
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Tushar Gosavi
>Assignee: Tushar Gosavi
>
> The DirectoryScanner has partitionIndex and partitionCount  declared as 
> private without any setters. Inherited DirectoryScanner can not set them and 
> hence can not call most of the methods in DirectoryScanner which depends on 
> these fields (acceptFile). 
> Also new DirectoryScanner has to implement createPartition as default one 
> creates instance of DirectoryScanner by default.
> Make the class inheritance friendly by adding setters and use kryo clone in 
> createPartition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXCORE-386) Upgrade to Jackson 1.9.13

2016-06-15 Thread Tatu Saloranta (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15332484#comment-15332484
 ] 

Tatu Saloranta commented on APEXCORE-386:
-

I run the existing test suite, as no new functionality was added or changed 
within project; change only concerns maven depencency of the patch version of 
Jackson library dependency. Patch versions in Jackson, in turn, follow semantic 
versioning and only contain bug fixes.


> Upgrade to Jackson 1.9.13
> -
>
> Key: APEXCORE-386
> URL: https://issues.apache.org/jira/browse/APEXCORE-386
> Project: Apache Apex Core
>  Issue Type: Improvement
>Affects Versions: 3.3.0
>Reporter: Tatu Saloranta
>Assignee: Tatu Saloranta
>Priority: Minor
> Fix For: 3.5.0
>
>
> Looks like current version of Apex core depends on Jackson 1.9.2. There are 
> much more recent patches, latest being 1.9.13. It would make sense to upgrade 
> to that to avoid problems fixed with the patches; most OSS libraries that 
> rely on Jackson 1.x rely on that version (like hadoop-core, avro)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[ANNOUNCE] New Apache Apex PMC Member: Siyuan Hua

2016-06-15 Thread Thomas Weise
The Apache Apex PMC is pleased to announce that Siyuan Hua is now a PMC
member. We appreciate all his contributions to the project so far, and are
looking forward to more.

Welcome, Siyuan, and congratulations!
Thomas, for the Apache Apex PMC.


Re: [ANNOUNCE] New Apache Apex PMC Member: Siyuan Hua

2016-06-15 Thread Ashwin Chandra Putta
Congratulations Siyuan!!
On Jun 15, 2016 9:26 PM, "Thomas Weise"  wrote:

> The Apache Apex PMC is pleased to announce that Siyuan Hua is now a PMC
> member. We appreciate all his contributions to the project so far, and are
> looking forward to more.
>
> Welcome, Siyuan, and congratulations!
> Thomas, for the Apache Apex PMC.
>


Re: [ANNOUNCE] New Apache Apex PMC Member: Siyuan Hua

2016-06-15 Thread Teddy Rusli
Congrats Siyuan!

On Wed, Jun 15, 2016 at 9:28 PM, Ashwin Chandra Putta <
ashwinchand...@gmail.com> wrote:

> Congratulations Siyuan!!
> On Jun 15, 2016 9:26 PM, "Thomas Weise"  wrote:
>
> > The Apache Apex PMC is pleased to announce that Siyuan Hua is now a PMC
> > member. We appreciate all his contributions to the project so far, and
> are
> > looking forward to more.
> >
> > Welcome, Siyuan, and congratulations!
> > Thomas, for the Apache Apex PMC.
> >
>



-- 
Regards,

Teddy Rusli


Re: [ANNOUNCE] New Apache Apex PMC Member: Siyuan Hua

2016-06-15 Thread Pradeep Kumbhar
Congratulations Siyuan!!

On Thu, Jun 16, 2016 at 10:17 AM, Teddy Rusli  wrote:

> Congrats Siyuan!
>
> On Wed, Jun 15, 2016 at 9:28 PM, Ashwin Chandra Putta <
> ashwinchand...@gmail.com> wrote:
>
> > Congratulations Siyuan!!
> > On Jun 15, 2016 9:26 PM, "Thomas Weise"  wrote:
> >
> > > The Apache Apex PMC is pleased to announce that Siyuan Hua is now a PMC
> > > member. We appreciate all his contributions to the project so far, and
> > are
> > > looking forward to more.
> > >
> > > Welcome, Siyuan, and congratulations!
> > > Thomas, for the Apache Apex PMC.
> > >
> >
>
>
>
> --
> Regards,
>
> Teddy Rusli
>



-- 
*regards,*
*~pradeep*


Re: [ANNOUNCE] New Apache Apex PMC Member: Siyuan Hua

2016-06-15 Thread Priyanka Gugale
Congrats Siyuan :)

-Priyanka

On Thu, Jun 16, 2016 at 10:19 AM, Pradeep Kumbhar 
wrote:

> Congratulations Siyuan!!
>
> On Thu, Jun 16, 2016 at 10:17 AM, Teddy Rusli 
> wrote:
>
> > Congrats Siyuan!
> >
> > On Wed, Jun 15, 2016 at 9:28 PM, Ashwin Chandra Putta <
> > ashwinchand...@gmail.com> wrote:
> >
> > > Congratulations Siyuan!!
> > > On Jun 15, 2016 9:26 PM, "Thomas Weise" 
> wrote:
> > >
> > > > The Apache Apex PMC is pleased to announce that Siyuan Hua is now a
> PMC
> > > > member. We appreciate all his contributions to the project so far,
> and
> > > are
> > > > looking forward to more.
> > > >
> > > > Welcome, Siyuan, and congratulations!
> > > > Thomas, for the Apache Apex PMC.
> > > >
> > >
> >
> >
> >
> > --
> > Regards,
> >
> > Teddy Rusli
> >
>
>
>
> --
> *regards,*
> *~pradeep*
>


Purging the checkpoints from the StreamingContainers

2016-06-15 Thread Sandesh Hegde
Hello Team,

Purging of the Checkpoints is done in Stram. Why not do that from the
StreamingContainers?

Committed window information is already available in StreamingContainers
and it will also distribute the computation across the containers.

Corner cases can still be handled in Stram. Example: Dynamic partitioning.

Thanks


[jira] [Commented] (APEXCORE-470) New Api for setting the attribute on the operator ( setOperatorAttribute )

2016-06-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15332479#comment-15332479
 ] 

ASF GitHub Bot commented on APEXCORE-470:
-

Github user sandeshh commented on a diff in the pull request:

https://github.com/apache/apex-core/pull/349#discussion_r67238893
  
--- Diff: api/src/main/java/com/datatorrent/api/DAG.java ---
@@ -236,11 +236,22 @@
   public abstract  void setAttribute(Attribute key, T value);
 
   /**
-   * setAttribute.
+   *
+   * Use {@link #setOperatorAttribute} instead
*/
--- End diff --

Addressed all the review comments. Travis builds fails randomly. I just 
updated JavaDoc, so it shouldn't have updated the Travis.


> New Api for setting the attribute on the operator ( setOperatorAttribute )
> --
>
> Key: APEXCORE-470
> URL: https://issues.apache.org/jira/browse/APEXCORE-470
> Project: Apache Apex Core
>  Issue Type: Improvement
>Reporter: Sandesh
>Assignee: Sandesh
>
> Currently, *setAttribute* is used to set the operator attributes. Other 2 
> Attribute setting APIs are specific to input ports (*setInputPortAttributes*) 
> and output ports (*setOutputPortsAttributes*).
> Proposal is to have *SetOperatorAttribute* api, which will clearly indicate 
> that user wants set attributes on the operator.
> ( setOperatorAttribute(Operator operator, Attribute key, T value) )
> Following will be the roles for the APIs
> *setAttributes* --> for setting Attributes for the whole DAG (  
> setAttribute(Operator operator, Attribute key, T value) - can be 
> deprecated )
> *setOperatorAttributes* --> for setting Attributes for the operator 
> All the unit test cases using the previous API will be renamed as a part of 
> this change.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] apex-malhar pull request #319: REVIEW ONLY: Operator supporting the Beam con...

2016-06-15 Thread davidyan74
GitHub user davidyan74 opened a pull request:

https://github.com/apache/apex-malhar/pull/319

REVIEW ONLY: Operator supporting the Beam concepts of windowing, 
watermarks, triggering and accumulation

This review-only PR contains the interfaces and a first rough draft of the 
implementation of the operator that supports the concepts of windowing, 
watermarks, triggering and accumulation mode specified by the Apache Beam API. 

Note that this is a rough first draft and it's not supposed to be merged.  
When you review this PR, please:
- ignore code style, missing copyright, missing javadoc, missing newline at 
EOF, etc. for now
- focus only on java files in the packages 
org.apache.apex.malhar.stream.window and 
org.apache.apex.malhar.stream.window.impl for now. if you have comments on 
changes outside of these two packages, please do so here: 
https://github.com/apache/apex-malhar/pull/316
- focus only on the semantics of those java files
- note the TODO comments

thank you.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/davidyan74/apex-malhar windowedOperator

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/319.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #319


commit 45d56b474567e24205c5e47db4bca3cce9ea5485
Author: Siyuan Hua 
Date:   2016-06-09T01:06:40Z

High-level WindowedStream API

commit 0de0587d226cf1a0d96b3c0ed9de8ccb325537aa
Author: David Yan 
Date:   2016-06-13T19:13:10Z

APEXMALHAR-2085 First draft of windowed operator interfaces

commit e1460f784391d4e01a1c3cbb0d49bb5aa6f60dd3
Author: David Yan 
Date:   2016-06-14T19:44:26Z

first try of implementation




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (APEXMALHAR-2120) Fix bugs on KafkaInputOperatorTest AbstractKafkaInputOperator

2016-06-15 Thread bright chen (JIRA)
bright chen created APEXMALHAR-2120:
---

 Summary: Fix bugs on KafkaInputOperatorTest 
AbstractKafkaInputOperator
 Key: APEXMALHAR-2120
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2120
 Project: Apache Apex Malhar
  Issue Type: Bug
Affects Versions: 3.4.0
Reporter: bright chen
Assignee: bright chen
 Fix For: 3.5.0


problems in Unit Test class: KafkaInputOperatorTest
- 'k' not initialized for each test case
- The assert was not correct
- The test case assume the END_TUPLE will be received at the end of normal 
tuples, but in fact the tuples could be out of order where support multiple 
cluster or partition
- The operator AbstractKafkaInputOperator implemented as "at least once", but 
the test case assume "exactly once"


problem of AbstractKafkaInputOperator:
For test case KafkaInputOperatorTest.testIdempotentInputOperatorWithFailure() 
with senario
{true, true, "one_to_many"}
("multi-cluster: true, multi-partition: true, partition: "one_to_many") throws 
following exception and the Collector Module didn't collect any data.
2016-06-15 10:43:56,358 [1/Kafka inputtesttopic0:KafkaSinglePortInputOperator] 
INFO stram.StramLocalCluster log - container-6 msg: Stopped running due to an 
exception. java.lang.RuntimeException: Couldn't replay the offset
at 
org.apache.apex.malhar.kafka.KafkaConsumerWrapper.emitImmediately(KafkaConsumerWrapper.java:146)
at 
org.apache.apex.malhar.kafka.AbstractKafkaInputOperator.replay(AbstractKafkaInputOperator.java:261)
at 
org.apache.apex.malhar.kafka.AbstractKafkaInputOperator.beginWindow(AbstractKafkaInputOperator.java:250)
at com.datatorrent.stram.engine.InputNode.run(InputNode.java:122)
at 
com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1407)
Caused by: org.apache.kafka.clients.consumer.NoOffsetForPartitionException: 
Undefined offset with no reset policy for partition: testtopic0-1
at 
org.apache.kafka.clients.consumer.internals.Fetcher.resetOffset(Fetcher.java:288)
at 
org.apache.kafka.clients.consumer.internals.Fetcher.updateFetchPositions(Fetcher.java:167)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.updateFetchPositions(KafkaConsumer.java:1302)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:895)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] apex-core pull request #349: APEXCORE-470 New API in DAG - setOperatorAttrib...

2016-06-15 Thread sandeshh
Github user sandeshh commented on a diff in the pull request:

https://github.com/apache/apex-core/pull/349#discussion_r67238893
  
--- Diff: api/src/main/java/com/datatorrent/api/DAG.java ---
@@ -236,11 +236,22 @@
   public abstract  void setAttribute(Attribute key, T value);
 
   /**
-   * setAttribute.
+   *
+   * Use {@link #setOperatorAttribute} instead
*/
--- End diff --

Addressed all the review comments. Travis builds fails randomly. I just 
updated JavaDoc, so it shouldn't have updated the Travis.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] apex-malhar pull request #320: APEXMALHAR-2120 #resolve #comment solve probl...

2016-06-15 Thread brightchen
GitHub user brightchen opened a pull request:

https://github.com/apache/apex-malhar/pull/320

APEXMALHAR-2120 #resolve #comment solve problems of KafkaInputOperato…

…rTest and AbstractKafkaInputOperator

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/brightchen/apex-malhar APEXMALHAR-2120

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/320.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #320


commit 7055fa2ddcd931b18e75bbb96b6e97f24f1f5889
Author: brightchen 
Date:   2016-06-14T23:30:17Z

APEXMALHAR-2120 #resolve #comment solve problems of KafkaInputOperatorTest 
and AbstractKafkaInputOperator




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: Parquet Writer Operator

2016-06-15 Thread Thomas Weise
Hi Dev,

Can you not use the existing WAL implementation (via WindowDataManager or
directly)?

Thomas


On Wed, Jun 15, 2016 at 3:47 PM, Devendra Tagare 
wrote:

> Hi,
>
> Initial thoughts were to go for a WAL based approach where the operator
> would first write POJO's to the WAL and then a separate thread would do the
> task of reading from the WAL and writing the destination files based on the
> block size.
>
> There is a ticket open for a pluggable spooling implementation with output
> operators which can be leveraged for this,
> https://issues.apache.org/jira/browse/APEXMALHAR-2037
>
> Since work is already being done on that front, we can plug in the spooler
> with the existing implementation of the ParquetFileWriter at that point and
> remove the first operator - ParquetFileOutputOperator.
>
> Thanks,
> Dev
>
> On Tue, Jun 14, 2016 at 7:21 PM, Thomas Weise 
> wrote:
>
> > What's the reason for not considering the WAL based approach?
> >
> > What are the pros and cons?
> >
> >
> > On Tue, Jun 14, 2016 at 6:54 PM, Devendra Tagare <
> > devend...@datatorrent.com>
> > wrote:
> >
> > > Hi All,
> > >
> > > We can focus on the below 2 problems,
> > > 1.Avoid the small files problem which could arise due a flush at every
> > > endWindow, since there wouldn't be significant data in a window.
> > > 2.Fault Tolerance.
> > >
> > > *Proposal* : Create a module in which there are 2 operators,
> > >
> > > *Operator 1 : ParquetFileOutputOperator*
> > > This operator will be an implementation of the
> > AbstractFileOutputOperator.
> > > It will write data to a HDFS location and leverage the fault-tolerance
> > > semantics of the AbstractFileOutputOperator.
> > >
> > > This operator will implement the CheckpointNotificationListener and
> will
> > > emit the finalizedFiles from the beforeCheckpoint method.
> > > Map
> > >
> > > *Operator 2 : ParquetFileWriter*
> > > This operator will receive a Set from
> the
> > > ParquetFileOutputOperator on its input port.
> > > Once it receives this map, it will do the below things,
> > >
> > > 1.Save the input received to a Map
> > inputFilesMap
> > >
> > > 2.Instantiate a new ParquetWriter
> > >   2.a. Get a unique file name.
> > >   2.b. Add a configurable writer that extends the ParquetWriter and
> > include
> > > a write support for writing various supported formats like Avro,thrift
> > etc.
> > >
> > > 3.For each file from the inputFilesMap,
> > >   3.a Read the file and write the record using the writer created in
> (2)
> > >   3.b Check if the block size (configurable) is reached.If yes then
> close
> > > the file and add its entry to a
> > > MapcompletedFilesMap.Remove the entry from
> > > inputFilesMap.
> > > If the writes fail then the files can be reprocessed from the
> > > inputFilesMap.
> > > 3.c In the committed callback remove the completed files from the
> > directory
> > > and prune the completedFilesMap for that window.
> > >
> > > Points to note,
> > > 1.The block size check will be approximate since the data is in memory
> > and
> > > ParquetWriter does not expose a flush.
> > > 2.This is at best a temporary implementation in the absence of a WAL
> > based
> > > approach.
> > >
> > > I would like to take a crack at this operator based on community
> > feedback.
> > >
> > > Thoughts ?
> > >
> > > Thanks,
> > > Dev
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Mon, Apr 25, 2016 at 12:36 PM, Tushar Gosavi <
> tus...@datatorrent.com>
> > > wrote:
> > >
> > > > Hi Shubham,
> > > >
> > > > +1 for the Parquet  writer.
> > > >
> > > > I doubt if we could leverage on recovery mechanism provided by
> > > > AbstractFileOutputOperator as Parquet Writer does not expose flush,
> and
> > > > could write to underline stream at any time. To simplify recovery you
> > can
> > > > write a single file in each checkpoint duration. If this is not an
> > > option,
> > > > then
> > > > you need to make use of WAL for recovery, and not use operator
> > > > check-pointing for storing not persisted tuples, as checkpointing
> huge
> > > > state every 30 seconds is costly.
> > > >
> > > > Regards,
> > > > -Tushar.
> > > >
> > > >
> > > > On Mon, Apr 25, 2016 at 11:38 PM, Shubham Pathak <
> > > shub...@datatorrent.com>
> > > > wrote:
> > > >
> > > > > Hello Community,
> > > > >
> > > > > Apache Parquet 
> > is a
> > > > > columnar oriented binary file format designed to be extremely
> > efficient
> > > > and
> > > > > interoperable across Hadoop ecosystem. It has integrations with
> most
> > of
> > > > the
> > > > > Hadoop processing frameworks ( Impala, Hive, Pig, Spark.. ) and
> > > > > serialization models (Thrift, Avro, Protobuf)  making it easy to
> use
> > in
> > > > ETL
> > > > > and processing pipelines.
> > > > >
> > > > > Having an operator to write data to Parquet files would certainly

[GitHub] apex-malhar pull request #320: APEXMALHAR-2120 #resolve #comment solve probl...

2016-06-15 Thread amberarrow
Github user amberarrow commented on a diff in the pull request:

https://github.com/apache/apex-malhar/pull/320#discussion_r67269717
  
--- Diff: 
kafka/src/test/java/org/apache/apex/malhar/kafka/KafkaInputOperatorTest.java ---
@@ -120,8 +127,14 @@ public KafkaInputOperatorTest(boolean hasMultiCluster, 
boolean hasMultiPartition
*/
   public static class CollectorModule extends BaseOperator
   {
-
-public final transient CollectorInputPort inputPort = new 
CollectorInputPort(this);
+public final transient DefaultInputPort inputPort = new 
DefaultInputPort()
+{
--- End diff --

indentation is odd


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (APEXMALHAR-2120) Fix bugs on KafkaInputOperatorTest AbstractKafkaInputOperator

2016-06-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15333107#comment-15333107
 ] 

ASF GitHub Bot commented on APEXMALHAR-2120:


GitHub user brightchen reopened a pull request:

https://github.com/apache/apex-malhar/pull/320

APEXMALHAR-2120 #resolve #comment solve problems of KafkaInputOperato…

…rTest and AbstractKafkaInputOperator

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/brightchen/apex-malhar APEXMALHAR-2120

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/320.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #320


commit 6f99fb20da7214f493b442ce4ba3f447ec51cd13
Author: brightchen 
Date:   2016-06-14T23:30:17Z

APEXMALHAR-2120 #resolve #comment solve problems of KafkaInputOperatorTest 
and AbstractKafkaInputOperator




> Fix bugs on KafkaInputOperatorTest AbstractKafkaInputOperator
> -
>
> Key: APEXMALHAR-2120
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2120
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Affects Versions: 3.4.0
>Reporter: bright chen
>Assignee: bright chen
> Fix For: 3.5.0
>
>
> problems in Unit Test class: KafkaInputOperatorTest
> - 'k' not initialized for each test case
> - The assert was not correct
> - The test case assume the END_TUPLE will be received at the end of normal 
> tuples, but in fact the tuples could be out of order where support multiple 
> cluster or partition
> - The operator AbstractKafkaInputOperator implemented as "at least once", but 
> the test case assume "exactly once"
> problem of AbstractKafkaInputOperator:
> For test case KafkaInputOperatorTest.testIdempotentInputOperatorWithFailure() 
> with senario
> {true, true, "one_to_many"}
> ("multi-cluster: true, multi-partition: true, partition: "one_to_many") 
> throws following exception and the Collector Module didn't collect any data.
> 2016-06-15 10:43:56,358 [1/Kafka 
> inputtesttopic0:KafkaSinglePortInputOperator] INFO stram.StramLocalCluster 
> log - container-6 msg: Stopped running due to an exception. 
> java.lang.RuntimeException: Couldn't replay the offset
> at 
> org.apache.apex.malhar.kafka.KafkaConsumerWrapper.emitImmediately(KafkaConsumerWrapper.java:146)
> at 
> org.apache.apex.malhar.kafka.AbstractKafkaInputOperator.replay(AbstractKafkaInputOperator.java:261)
> at 
> org.apache.apex.malhar.kafka.AbstractKafkaInputOperator.beginWindow(AbstractKafkaInputOperator.java:250)
> at com.datatorrent.stram.engine.InputNode.run(InputNode.java:122)
> at 
> com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1407)
> Caused by: org.apache.kafka.clients.consumer.NoOffsetForPartitionException: 
> Undefined offset with no reset policy for partition: testtopic0-1
> at 
> org.apache.kafka.clients.consumer.internals.Fetcher.resetOffset(Fetcher.java:288)
> at 
> org.apache.kafka.clients.consumer.internals.Fetcher.updateFetchPositions(Fetcher.java:167)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.updateFetchPositions(KafkaConsumer.java:1302)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:895)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] apex-malhar pull request #320: APEXMALHAR-2120 #resolve #comment solve probl...

2016-06-15 Thread brightchen
GitHub user brightchen reopened a pull request:

https://github.com/apache/apex-malhar/pull/320

APEXMALHAR-2120 #resolve #comment solve problems of KafkaInputOperato…

…rTest and AbstractKafkaInputOperator

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/brightchen/apex-malhar APEXMALHAR-2120

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/320.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #320


commit 6f99fb20da7214f493b442ce4ba3f447ec51cd13
Author: brightchen 
Date:   2016-06-14T23:30:17Z

APEXMALHAR-2120 #resolve #comment solve problems of KafkaInputOperatorTest 
and AbstractKafkaInputOperator




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] apex-malhar pull request #320: APEXMALHAR-2120 #resolve #comment solve probl...

2016-06-15 Thread brightchen
Github user brightchen closed the pull request at:

https://github.com/apache/apex-malhar/pull/320


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [ANNOUNCE] New Apache Apex PMC Member: Siyuan Hua

2016-06-15 Thread Aniruddha Thombare
Congratulations!!!

Thanks,

A

_
Sent with difficulty, I mean handheld ;)
On 16 Jun 2016 10:47 am, "Devendra Tagare" 
wrote:

> Congratulations Siyuan
>
> Cheers,
> Dev
> On Jun 15, 2016 10:13 PM, "Chinmay Kolhatkar" 
> wrote:
>
> > Congrats Siyuan :)
> >
> > On Wed, Jun 15, 2016 at 10:05 PM, Priyanka Gugale 
> > wrote:
> >
> > > Congrats Siyuan :)
> > >
> > > -Priyanka
> > >
> > > On Thu, Jun 16, 2016 at 10:19 AM, Pradeep Kumbhar <
> > prad...@datatorrent.com
> > > >
> > > wrote:
> > >
> > > > Congratulations Siyuan!!
> > > >
> > > > On Thu, Jun 16, 2016 at 10:17 AM, Teddy Rusli  >
> > > > wrote:
> > > >
> > > > > Congrats Siyuan!
> > > > >
> > > > > On Wed, Jun 15, 2016 at 9:28 PM, Ashwin Chandra Putta <
> > > > > ashwinchand...@gmail.com> wrote:
> > > > >
> > > > > > Congratulations Siyuan!!
> > > > > > On Jun 15, 2016 9:26 PM, "Thomas Weise" 
> > > > wrote:
> > > > > >
> > > > > > > The Apache Apex PMC is pleased to announce that Siyuan Hua is
> > now a
> > > > PMC
> > > > > > > member. We appreciate all his contributions to the project so
> > far,
> > > > and
> > > > > > are
> > > > > > > looking forward to more.
> > > > > > >
> > > > > > > Welcome, Siyuan, and congratulations!
> > > > > > > Thomas, for the Apache Apex PMC.
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Regards,
> > > > >
> > > > > Teddy Rusli
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > *regards,*
> > > > *~pradeep*
> > > >
> > >
> >
>


[jira] [Commented] (APEXMALHAR-2120) Fix bugs on KafkaInputOperatorTest AbstractKafkaInputOperator

2016-06-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15333106#comment-15333106
 ] 

ASF GitHub Bot commented on APEXMALHAR-2120:


Github user brightchen closed the pull request at:

https://github.com/apache/apex-malhar/pull/320


> Fix bugs on KafkaInputOperatorTest AbstractKafkaInputOperator
> -
>
> Key: APEXMALHAR-2120
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2120
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Affects Versions: 3.4.0
>Reporter: bright chen
>Assignee: bright chen
> Fix For: 3.5.0
>
>
> problems in Unit Test class: KafkaInputOperatorTest
> - 'k' not initialized for each test case
> - The assert was not correct
> - The test case assume the END_TUPLE will be received at the end of normal 
> tuples, but in fact the tuples could be out of order where support multiple 
> cluster or partition
> - The operator AbstractKafkaInputOperator implemented as "at least once", but 
> the test case assume "exactly once"
> problem of AbstractKafkaInputOperator:
> For test case KafkaInputOperatorTest.testIdempotentInputOperatorWithFailure() 
> with senario
> {true, true, "one_to_many"}
> ("multi-cluster: true, multi-partition: true, partition: "one_to_many") 
> throws following exception and the Collector Module didn't collect any data.
> 2016-06-15 10:43:56,358 [1/Kafka 
> inputtesttopic0:KafkaSinglePortInputOperator] INFO stram.StramLocalCluster 
> log - container-6 msg: Stopped running due to an exception. 
> java.lang.RuntimeException: Couldn't replay the offset
> at 
> org.apache.apex.malhar.kafka.KafkaConsumerWrapper.emitImmediately(KafkaConsumerWrapper.java:146)
> at 
> org.apache.apex.malhar.kafka.AbstractKafkaInputOperator.replay(AbstractKafkaInputOperator.java:261)
> at 
> org.apache.apex.malhar.kafka.AbstractKafkaInputOperator.beginWindow(AbstractKafkaInputOperator.java:250)
> at com.datatorrent.stram.engine.InputNode.run(InputNode.java:122)
> at 
> com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1407)
> Caused by: org.apache.kafka.clients.consumer.NoOffsetForPartitionException: 
> Undefined offset with no reset policy for partition: testtopic0-1
> at 
> org.apache.kafka.clients.consumer.internals.Fetcher.resetOffset(Fetcher.java:288)
> at 
> org.apache.kafka.clients.consumer.internals.Fetcher.updateFetchPositions(Fetcher.java:167)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.updateFetchPositions(KafkaConsumer.java:1302)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:895)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [ANNOUNCE] New Apache Apex PMC Member: Siyuan Hua

2016-06-15 Thread Devendra Tagare
Congratulations Siyuan

Cheers,
Dev
On Jun 15, 2016 10:13 PM, "Chinmay Kolhatkar" 
wrote:

> Congrats Siyuan :)
>
> On Wed, Jun 15, 2016 at 10:05 PM, Priyanka Gugale 
> wrote:
>
> > Congrats Siyuan :)
> >
> > -Priyanka
> >
> > On Thu, Jun 16, 2016 at 10:19 AM, Pradeep Kumbhar <
> prad...@datatorrent.com
> > >
> > wrote:
> >
> > > Congratulations Siyuan!!
> > >
> > > On Thu, Jun 16, 2016 at 10:17 AM, Teddy Rusli 
> > > wrote:
> > >
> > > > Congrats Siyuan!
> > > >
> > > > On Wed, Jun 15, 2016 at 9:28 PM, Ashwin Chandra Putta <
> > > > ashwinchand...@gmail.com> wrote:
> > > >
> > > > > Congratulations Siyuan!!
> > > > > On Jun 15, 2016 9:26 PM, "Thomas Weise" 
> > > wrote:
> > > > >
> > > > > > The Apache Apex PMC is pleased to announce that Siyuan Hua is
> now a
> > > PMC
> > > > > > member. We appreciate all his contributions to the project so
> far,
> > > and
> > > > > are
> > > > > > looking forward to more.
> > > > > >
> > > > > > Welcome, Siyuan, and congratulations!
> > > > > > Thomas, for the Apache Apex PMC.
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Regards,
> > > >
> > > > Teddy Rusli
> > > >
> > >
> > >
> > >
> > > --
> > > *regards,*
> > > *~pradeep*
> > >
> >
>


Re: [ANNOUNCE] New Apache Apex PMC Member: Siyuan Hua

2016-06-15 Thread Sandeep Deshmukh
Congrats Siyuan !!!

Regards,
Sandeep

On Thu, Jun 16, 2016 at 10:49 AM, Aniruddha Thombare <
anirud...@datatorrent.com> wrote:

> Congratulations!!!
>
> Thanks,
>
> A
>
> _
> Sent with difficulty, I mean handheld ;)
> On 16 Jun 2016 10:47 am, "Devendra Tagare" 
> wrote:
>
> > Congratulations Siyuan
> >
> > Cheers,
> > Dev
> > On Jun 15, 2016 10:13 PM, "Chinmay Kolhatkar" 
> > wrote:
> >
> > > Congrats Siyuan :)
> > >
> > > On Wed, Jun 15, 2016 at 10:05 PM, Priyanka Gugale 
> > > wrote:
> > >
> > > > Congrats Siyuan :)
> > > >
> > > > -Priyanka
> > > >
> > > > On Thu, Jun 16, 2016 at 10:19 AM, Pradeep Kumbhar <
> > > prad...@datatorrent.com
> > > > >
> > > > wrote:
> > > >
> > > > > Congratulations Siyuan!!
> > > > >
> > > > > On Thu, Jun 16, 2016 at 10:17 AM, Teddy Rusli <
> te...@datatorrent.com
> > >
> > > > > wrote:
> > > > >
> > > > > > Congrats Siyuan!
> > > > > >
> > > > > > On Wed, Jun 15, 2016 at 9:28 PM, Ashwin Chandra Putta <
> > > > > > ashwinchand...@gmail.com> wrote:
> > > > > >
> > > > > > > Congratulations Siyuan!!
> > > > > > > On Jun 15, 2016 9:26 PM, "Thomas Weise" <
> tho...@datatorrent.com>
> > > > > wrote:
> > > > > > >
> > > > > > > > The Apache Apex PMC is pleased to announce that Siyuan Hua is
> > > now a
> > > > > PMC
> > > > > > > > member. We appreciate all his contributions to the project so
> > > far,
> > > > > and
> > > > > > > are
> > > > > > > > looking forward to more.
> > > > > > > >
> > > > > > > > Welcome, Siyuan, and congratulations!
> > > > > > > > Thomas, for the Apache Apex PMC.
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Regards,
> > > > > >
> > > > > > Teddy Rusli
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > *regards,*
> > > > > *~pradeep*
> > > > >
> > > >
> > >
> >
>


Re: [ANNOUNCE] New Apache Apex PMC Member: Siyuan Hua

2016-06-15 Thread Bhupesh Chawda
Congratulations Siyuan!!

~ Bhupesh

On Wed, Jun 15, 2016 at 10:19 PM, Sandeep Deshmukh 
wrote:

> Congrats Siyuan !!!
>
> Regards,
> Sandeep
>
> On Thu, Jun 16, 2016 at 10:49 AM, Aniruddha Thombare <
> anirud...@datatorrent.com> wrote:
>
> > Congratulations!!!
> >
> > Thanks,
> >
> > A
> >
> > _
> > Sent with difficulty, I mean handheld ;)
> > On 16 Jun 2016 10:47 am, "Devendra Tagare" 
> > wrote:
> >
> > > Congratulations Siyuan
> > >
> > > Cheers,
> > > Dev
> > > On Jun 15, 2016 10:13 PM, "Chinmay Kolhatkar"  >
> > > wrote:
> > >
> > > > Congrats Siyuan :)
> > > >
> > > > On Wed, Jun 15, 2016 at 10:05 PM, Priyanka Gugale  >
> > > > wrote:
> > > >
> > > > > Congrats Siyuan :)
> > > > >
> > > > > -Priyanka
> > > > >
> > > > > On Thu, Jun 16, 2016 at 10:19 AM, Pradeep Kumbhar <
> > > > prad...@datatorrent.com
> > > > > >
> > > > > wrote:
> > > > >
> > > > > > Congratulations Siyuan!!
> > > > > >
> > > > > > On Thu, Jun 16, 2016 at 10:17 AM, Teddy Rusli <
> > te...@datatorrent.com
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Congrats Siyuan!
> > > > > > >
> > > > > > > On Wed, Jun 15, 2016 at 9:28 PM, Ashwin Chandra Putta <
> > > > > > > ashwinchand...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Congratulations Siyuan!!
> > > > > > > > On Jun 15, 2016 9:26 PM, "Thomas Weise" <
> > tho...@datatorrent.com>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > The Apache Apex PMC is pleased to announce that Siyuan Hua
> is
> > > > now a
> > > > > > PMC
> > > > > > > > > member. We appreciate all his contributions to the project
> so
> > > > far,
> > > > > > and
> > > > > > > > are
> > > > > > > > > looking forward to more.
> > > > > > > > >
> > > > > > > > > Welcome, Siyuan, and congratulations!
> > > > > > > > > Thomas, for the Apache Apex PMC.
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Regards,
> > > > > > >
> > > > > > > Teddy Rusli
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > *regards,*
> > > > > > *~pradeep*
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [ANNOUNCE] New Apache Apex PMC Member: Siyuan Hua

2016-06-15 Thread Ashish Tadose
Congratulations Siyuan.

Ashish

> On 16-Jun-2016, at 10:49 AM, Aniruddha Thombare  
> wrote:
> 
> Congratulations!!!
> 
> Thanks,
> 
> A
> 
> _
> Sent with difficulty, I mean handheld ;)
> On 16 Jun 2016 10:47 am, "Devendra Tagare" 
> wrote:
> 
>> Congratulations Siyuan
>> 
>> Cheers,
>> Dev
>> On Jun 15, 2016 10:13 PM, "Chinmay Kolhatkar" 
>> wrote:
>> 
>>> Congrats Siyuan :)
>>> 
>>> On Wed, Jun 15, 2016 at 10:05 PM, Priyanka Gugale 
>>> wrote:
>>> 
 Congrats Siyuan :)
 
 -Priyanka
 
 On Thu, Jun 16, 2016 at 10:19 AM, Pradeep Kumbhar <
>>> prad...@datatorrent.com
> 
 wrote:
 
> Congratulations Siyuan!!
> 
> On Thu, Jun 16, 2016 at 10:17 AM, Teddy Rusli >> 
> wrote:
> 
>> Congrats Siyuan!
>> 
>> On Wed, Jun 15, 2016 at 9:28 PM, Ashwin Chandra Putta <
>> ashwinchand...@gmail.com> wrote:
>> 
>>> Congratulations Siyuan!!
>>> On Jun 15, 2016 9:26 PM, "Thomas Weise" 
> wrote:
>>> 
 The Apache Apex PMC is pleased to announce that Siyuan Hua is
>>> now a
> PMC
 member. We appreciate all his contributions to the project so
>>> far,
> and
>>> are
 looking forward to more.
 
 Welcome, Siyuan, and congratulations!
 Thomas, for the Apache Apex PMC.
 
>>> 
>> 
>> 
>> 
>> --
>> Regards,
>> 
>> Teddy Rusli
>> 
> 
> 
> 
> --
> *regards,*
> *~pradeep*
> 
 
>>> 
>>