[jira] [Updated] (APEXMALHAR-2532) Transform Application Test flooding CI logs

2017-08-01 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda updated APEXMALHAR-2532:
---
Labels: newbie  (was: )

> Transform Application Test flooding CI logs
> ---
>
> Key: APEXMALHAR-2532
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2532
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Thomas Weise
>  Labels: newbie
>
> Test should assert results and not print them.
> {code}
> ---
>  T E S T S
> ---
> Running org.apache.apex.examples.transform.ApplicationTest
> 2017-08-01 13:32:09,370 [main] WARN  util.NativeCodeLoader  - Unable 
> to load native-hadoop library for your platform... using builtin-java classes 
> where applicable
> 2017-08-01 13:32:10,070 [master] WARN  stram.Journal write - Journal output 
> stream is null. Skipping write to the WAL.
> 2017-08-01 13:32:10,074 [master] WARN  stram.Journal write - Journal output 
> stream is null. Skipping write to the WAL.
> 2017-08-01 13:32:10,075 [master] WARN  stram.Journal write - Journal output 
> stream is null. Skipping write to the WAL.
> 2017-08-01 13:32:10,076 [master] WARN  stram.Journal write - Journal output 
> stream is null. Skipping write to the WAL.
> CustomerInfo{customerId=91459, name='Ldb dKzm', age=77, address='xxd'}
> CustomerInfo{customerId=19131, name='qpjaSoQfxF TFlKPemU', age=86, 
> address='myrbhszyudll'}
> CustomerInfo{customerId=20122, name='YOEFzMreda oqSU', age=15, address='rq'}
> CustomerInfo{customerId=75939, name='sB Eha', age=48, address='czkuzvzlh'}
> CustomerInfo{customerId=54846, name='zRL Ss', age=11, address='hmx'}
> CustomerInfo{customerId=22583, name='WcV V', age=58, address='mw'}
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (APEXCORE-602) Provide a "group-id" in the event object so that events are grouped together by a "root cause".

2017-06-28 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXCORE-602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXCORE-602.
-
   Resolution: Fixed
Fix Version/s: 3.7.0

> Provide a "group-id" in the event object so that events are grouped together 
> by a "root cause".
> ---
>
> Key: APEXCORE-602
> URL: https://issues.apache.org/jira/browse/APEXCORE-602
> Project: Apache Apex Core
>  Issue Type: Improvement
>Reporter: Sanjay M Pujare
>Assignee: Priyanka Gugale
> Fix For: 3.7.0
>
>
> Provide a "group-id" in the event object so that events are grouped together 
> by a "root cause". An example is a bunch of container restarts are related to 
> a single failure in the application but the current sequence of Stram events 
> doesn't make it obvious. The consumer of events is able to better 
> read/analyze the events because of the group-id and focus on the root-cause.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (APEXMALHAR-2507) Example for inner join functionality using Windowed merge operator

2017-06-19 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16053535#comment-16053535
 ] 

Bhupesh Chawda commented on APEXMALHAR-2507:


I agree. The current example need not be removed. It should be modified to use 
windowed operator.

> Example for inner join functionality using Windowed merge operator
> --
>
> Key: APEXMALHAR-2507
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2507
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Reporter: Bhupesh Chawda
>Assignee: Shunxin Lu
>
> With removal of the inner join operator (based on managed state), we need to 
> present windowed merge operator as the means of doing an inner join. An 
> example showing how to do an inner join using the windowed merge operator is 
> needed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (APEXMALHAR-2366) Apply BloomFilter to Bucket

2017-06-07 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-2366.

   Resolution: Done
Fix Version/s: 3.8.0

> Apply BloomFilter to Bucket
> ---
>
> Key: APEXMALHAR-2366
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2366
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Reporter: bright chen
>Assignee: bright chen
> Fix For: 3.8.0
>
>   Original Estimate: 192h
>  Remaining Estimate: 192h
>
> The bucket get() will check the cache and then check from the stored files if 
> the entry is not in the cache. The checking from files is a pretty heavy 
> operation due to file seek.
> The chance of check from file is very high if the key range are large.
> Suggest to apply BloomFilter for bucket to reduce the chance read from file.
> If the buckets were managed by ManagedStateImpl, the entry of bucket would be 
> very huge and the BloomFilter maybe not useful after a while. But If the 
> buckets were managed by ManagedTimeUnifiedStateImpl, each bucket keep certain 
> amount of entry and BloomFilter would be very useful.
> For implementation:
> The Guava already have BloomFilter and the interface are pretty simple and 
> fit for our case. But Guava 11 is not compatible with Guava 14 (Guava 11 use 
> Sink while Guava 14 use PrimitiveSink).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (APEXMALHAR-2507) Example for inner join functionality using Windowed merge operator

2017-06-06 Thread Bhupesh Chawda (JIRA)
Bhupesh Chawda created APEXMALHAR-2507:
--

 Summary: Example for inner join functionality using Windowed merge 
operator
 Key: APEXMALHAR-2507
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2507
 Project: Apache Apex Malhar
  Issue Type: Improvement
Reporter: Bhupesh Chawda


With removal of the inner join operator (based on managed state), we need to 
present windowed merge operator as the means of doing an inner join. An example 
showing how to do an inner join using the windowed merge operator is needed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (APEXMALHAR-2503) CI Flake: org.apache.apex.malhar.lib.join.POJOPartitionJoinOperatorTest

2017-05-28 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16028012#comment-16028012
 ] 

Bhupesh Chawda commented on APEXMALHAR-2503:


Please see my comment on https://issues.apache.org/jira/browse/APEXMALHAR-2488 
about deprecating the inner join operator.
Would it make sense to comment this test out, given that we no more support the 
inner join operator?

> CI Flake: org.apache.apex.malhar.lib.join.POJOPartitionJoinOperatorTest
> ---
>
> Key: APEXMALHAR-2503
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2503
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Thomas Weise
>
> Test failure in Travis:
> Running org.apache.apex.malhar.lib.join.POJOPartitionJoinOperatorTest
> Test does not terminate.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (APEXMALHAR-2488) Simplify join support in Malhar

2017-05-28 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16028010#comment-16028010
 ] 

Bhupesh Chawda commented on APEXMALHAR-2488:


I think we need to deprecate the inner join operator for now.
Even though it is not working right now and is marked evolving, we cannot 
blindly remove it from the current master.

> Simplify join support in Malhar
> ---
>
> Key: APEXMALHAR-2488
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2488
> Project: Apache Apex Malhar
>  Issue Type: Task
>Reporter: Bhupesh Chawda
>Assignee: Bhupesh Chawda
>
> Currently there are multiple join implementations in Malhar. 
> One is an inner join implementation based on managed state, and another is a 
> merge operator based on the windowed operator and uses spillable data 
> structures.
> This JIRA is to remove the inner join implementation from malhar - 
> org/apache/apex/malhar/lib/join/
> The merge operator also needs to be simplified - perhaps by providing a 
> wrapper around it which can be used without requiring any knowledge of 
> windowing concepts. This will be a separate JIRA.
> Discussions here: 
> https://lists.apache.org/thread.html/34e5b8753a1ea512eb8ae95af03b7646f23124705fe1c07b069a46c5@%3Cdev.apex.apache.org%3E



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (APEXMALHAR-2488) Simplify join support in Malhar

2017-05-14 Thread Bhupesh Chawda (JIRA)
Bhupesh Chawda created APEXMALHAR-2488:
--

 Summary: Simplify join support in Malhar
 Key: APEXMALHAR-2488
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2488
 Project: Apache Apex Malhar
  Issue Type: Task
Reporter: Bhupesh Chawda
Assignee: Bhupesh Chawda


Currently there are multiple join implementations in Malhar. 
One is an inner join implementation based on managed state, and another is a 
merge operator based on the windowed operator and uses spillable data 
structures.
This JIRA is to remove the inner join implementation from malhar - 
org/apache/apex/malhar/lib/join/

The merge operator also needs to be simplified - perhaps by providing a wrapper 
around it which can be used without requiring any knowledge of windowing 
concepts. This will be a separate JIRA.

Discussions here: 
https://lists.apache.org/thread.html/34e5b8753a1ea512eb8ae95af03b7646f23124705fe1c07b069a46c5@%3Cdev.apex.apache.org%3E



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (APEXCORE-678) Shutdown of application should start from input nodes

2017-04-16 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXCORE-678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXCORE-678.
-
Resolution: Fixed

> Shutdown of application should start from input nodes
> -
>
> Key: APEXCORE-678
> URL: https://issues.apache.org/jira/browse/APEXCORE-678
> Project: Apache Apex Core
>  Issue Type: Bug
>Reporter: Bhupesh Chawda
>Assignee: Bhupesh Chawda
> Fix For: 3.6.0
>
>
> Streaming container calls shutdown() for all nodes instead of just input 
> nodes.
> {code}
>   private void stopInputNodes()
>   {
> for (Entry e : nodes.entrySet()) {
>   Node node = e.getValue();
>   if (node instanceof InputNode) {
> final Thread thread = e.getValue().context.getThread();
> if (thread == null || !thread.isAlive()) {
>   continue;
> }
>   }
>   node.shutdown(true);
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (APEXCORE-654) Recovery window is not updated when Delay Operator is used along with Partitioned Operators

2017-04-04 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXCORE-654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda updated APEXCORE-654:

Attachment: ProblemDag.png

> Recovery window is not updated when Delay Operator is used along with 
> Partitioned Operators
> ---
>
> Key: APEXCORE-654
> URL: https://issues.apache.org/jira/browse/APEXCORE-654
> Project: Apache Apex Core
>  Issue Type: Bug
>Affects Versions: 3.5.0
> Environment: Hadoop 2.7.2
> Apache Apex 3.5.0
> Apache Apex Malhar 3.6.0
>Reporter: Ambarish Pande
>Assignee: Bhupesh Chawda
>  Labels: DelayOperator
> Attachments: ProblemDag.png
>
>
> Checkpointing is not happening when DefaultDelayOperator is used in a DAG in 
> which some upstream operators are Partitioned.
> When used without partitioning, I can see the operators being check-pointed 
> properly.
> Here is the link of the App source code and also the built apa file.
> https://github.com/ambarishpande/delay-operator-test



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (APEXCORE-503) support KillException

2017-03-28 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15944838#comment-15944838
 ] 

Bhupesh Chawda commented on APEXCORE-503:
-

The graceful + abort shutdown for a dag has been implemented already as part of 
https://issues.apache.org/jira/browse/APEXCORE-294.
I suggest we use this ticket to allow shutdown of the DAG via any operator. 
This can be done using a `DagShutdownException` via an operator. The exception 
would be caught by the Node and sent to the master via a heartbeat. The master 
then uses the graceful / abort shutdown mechanism to shutdown the entire dag.
The advantage of this would be that the shutdown could be triggered by any 
operator in the dag, not just the input operators.
Thoughts?

> support KillException
> -
>
> Key: APEXCORE-503
> URL: https://issues.apache.org/jira/browse/APEXCORE-503
> Project: Apache Apex Core
>  Issue Type: Improvement
>Reporter: Sandesh
>
> Current way for Operators to stop the whole app is to use 
> "ShutdownException". But it is considered as a graceful stop. To stop the 
> whole App when an error condition happens, new exception should be supported, 
> called "KillException" or "KillAppException.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (APEXCORE-654) Recovery window is not updated when Delay Operator is used along with Partitioned Operators

2017-03-25 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15942151#comment-15942151
 ] 

Bhupesh Chawda edited comment on APEXCORE-654 at 3/26/17 5:37 AM:
--

The recovery window is not updated in cases like the example attached.
The issue seems to be that when delay is part of a checkpoint group, the 
unifiers are not considered as part of the same group. This is because logical 
operator groups are used to identify strongly connected components and unifiers 
are not part of the logical plan.


was (Author: bhupesh):
The recovery window is not updated in cases like the example attached.

> Recovery window is not updated when Delay Operator is used along with 
> Partitioned Operators
> ---
>
> Key: APEXCORE-654
> URL: https://issues.apache.org/jira/browse/APEXCORE-654
> Project: Apache Apex Core
>  Issue Type: Bug
>Affects Versions: 3.5.0
> Environment: Hadoop 2.7.2
> Apache Apex 3.5.0
> Apache Apex Malhar 3.6.0
>Reporter: Ambarish Pande
>Assignee: Bhupesh Chawda
>  Labels: DelayOperator
>
> Checkpointing is not happening when DefaultDelayOperator is used in a DAG in 
> which some upstream operators are Partitioned.
> When used without partitioning, I can see the operators being check-pointed 
> properly.
> Here is the link of the App source code and also the built apa file.
> https://github.com/ambarishpande/delay-operator-test



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (APEXCORE-654) Recovery window is not updated when Delay Operator is used along with Partitioned Operators

2017-03-25 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15942151#comment-15942151
 ] 

Bhupesh Chawda commented on APEXCORE-654:
-

The recovery window is not updated in cases like the example attached.

> Recovery window is not updated when Delay Operator is used along with 
> Partitioned Operators
> ---
>
> Key: APEXCORE-654
> URL: https://issues.apache.org/jira/browse/APEXCORE-654
> Project: Apache Apex Core
>  Issue Type: Bug
>Affects Versions: 3.5.0
> Environment: Hadoop 2.7.2
> Apache Apex 3.5.0
> Apache Apex Malhar 3.6.0
>Reporter: Ambarish Pande
>Assignee: Bhupesh Chawda
>  Labels: DelayOperator
>
> Checkpointing is not happening when DefaultDelayOperator is used in a DAG in 
> which some upstream operators are Partitioned.
> When used without partitioning, I can see the operators being check-pointed 
> properly.
> Here is the link of the App source code and also the built apa file.
> https://github.com/ambarishpande/delay-operator-test



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (APEXCORE-654) Recovery window is not updated when Delay Operator is used along with Partitioned Operators

2017-03-25 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXCORE-654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda reassigned APEXCORE-654:
---

Assignee: Bhupesh Chawda

> Recovery window is not updated when Delay Operator is used along with 
> Partitioned Operators
> ---
>
> Key: APEXCORE-654
> URL: https://issues.apache.org/jira/browse/APEXCORE-654
> Project: Apache Apex Core
>  Issue Type: Bug
>Affects Versions: 3.5.0
> Environment: Hadoop 2.7.2
> Apache Apex 3.5.0
> Apache Apex Malhar 3.6.0
>Reporter: Ambarish Pande
>Assignee: Bhupesh Chawda
>  Labels: DelayOperator
>
> Checkpointing is not happening when DefaultDelayOperator is used in a DAG in 
> which some upstream operators are Partitioned.
> When used without partitioning, I can see the operators being check-pointed 
> properly.
> Here is the link of the App source code and also the built apa file.
> https://github.com/ambarishpande/delay-operator-test



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (APEXCORE-678) Shutdown of application should start from input nodes

2017-03-22 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXCORE-678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda updated APEXCORE-678:

Description: 
Streaming container calls shutdown() for all nodes instead of just input nodes.

{code}
  private void stopInputNodes()
  {
for (Entry e : nodes.entrySet()) {
  Node node = e.getValue();
  if (node instanceof InputNode) {
final Thread thread = e.getValue().context.getThread();
if (thread == null || !thread.isAlive()) {
  continue;
}
  }
  node.shutdown(true);
}
  }
{code}

  was:
Streaming container calls shutdown() for all nodes instead of just input nodes.

```
  private void stopInputNodes()
  {
for (Entry e : nodes.entrySet()) {
  Node node = e.getValue();
  if (node instanceof InputNode) {
final Thread thread = e.getValue().context.getThread();
if (thread == null || !thread.isAlive()) {
  continue;
}
  }
  node.shutdown(true);
}
  }
```


> Shutdown of application should start from input nodes
> -
>
> Key: APEXCORE-678
> URL: https://issues.apache.org/jira/browse/APEXCORE-678
> Project: Apache Apex Core
>  Issue Type: Bug
>Reporter: Bhupesh Chawda
>Assignee: Bhupesh Chawda
>
> Streaming container calls shutdown() for all nodes instead of just input 
> nodes.
> {code}
>   private void stopInputNodes()
>   {
> for (Entry e : nodes.entrySet()) {
>   Node node = e.getValue();
>   if (node instanceof InputNode) {
> final Thread thread = e.getValue().context.getThread();
> if (thread == null || !thread.isAlive()) {
>   continue;
> }
>   }
>   node.shutdown(true);
> }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (APEXCORE-678) Shutdown of application should start from input nodes

2017-03-22 Thread Bhupesh Chawda (JIRA)
Bhupesh Chawda created APEXCORE-678:
---

 Summary: Shutdown of application should start from input nodes
 Key: APEXCORE-678
 URL: https://issues.apache.org/jira/browse/APEXCORE-678
 Project: Apache Apex Core
  Issue Type: Bug
Reporter: Bhupesh Chawda
Assignee: Bhupesh Chawda


Streaming container calls shutdown() for all nodes instead of just input nodes.

```
  private void stopInputNodes()
  {
for (Entry e : nodes.entrySet()) {
  Node node = e.getValue();
  if (node instanceof InputNode) {
final Thread thread = e.getValue().context.getThread();
if (thread == null || !thread.isAlive()) {
  continue;
}
  }
  node.shutdown(true);
}
  }
```



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (APEXMALHAR-2451) Batch support for File I/O operators

2017-03-21 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda updated APEXMALHAR-2451:
---
Summary: Batch support for File I/O operators  (was: Batch demarcation for 
File I/O operators)

> Batch support for File I/O operators
> 
>
> Key: APEXMALHAR-2451
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2451
> Project: Apache Apex Malhar
>  Issue Type: Sub-task
>Reporter: Bhupesh Chawda
>Assignee: Bhupesh Chawda
>
> Adapt the file input and output operators for supporting batch demarcation. 
> This would require definition of batch start and end (Begin file and end of 
> file) tuples.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (APEXMALHAR-2449) Support for batch demarcation in input/output operators

2017-03-17 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda updated APEXMALHAR-2449:
---
Description: 
In order to support the demarcation of multiple batches in an application, we 
need to adapt the I/O operators to be able to send and receive control tuples. 
These control tuples would serve multiple purposes:
1. They can be used as watermarks thereby allowing stateful windowed processing
2. They can also serve as batch demarcation tuples which allows us to separate 
the state for these batches in the intermediate as well as the output operators.

Discussion thread here: 
https://lists.apache.org/thread.html/b4b27a59ecfe506260f694f66d520ba1e906ca5b2f360b07e173039b@%3Cdev.apex.apache.org%3E

This would be worked on once apex-malhar depends on the release of apex-core 
which has control tuple support.

  was:
In order to support the demarcation of multiple batches in an application, we 
need to adapt the I/O operators to be able to send and receive control tuples. 
These control tuples would serve multiple purposes:
1. They can be used as watermarks thereby allowing stateful windowed processing
2. They can also serve as batch demarcation tuples which allows us to separate 
the state for these batches in the intermediate as well as the output operators.

Discussion thread here: 
https://lists.apache.org/thread.html/b4b27a59ecfe506260f694f66d520ba1e906ca5b2f360b07e173039b@%3Cdev.apex.apache.org%3E


> Support for batch demarcation in input/output operators
> ---
>
> Key: APEXMALHAR-2449
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2449
> Project: Apache Apex Malhar
>  Issue Type: New Feature
>Reporter: Bhupesh Chawda
>Assignee: Bhupesh Chawda
>
> In order to support the demarcation of multiple batches in an application, we 
> need to adapt the I/O operators to be able to send and receive control 
> tuples. These control tuples would serve multiple purposes:
> 1. They can be used as watermarks thereby allowing stateful windowed 
> processing
> 2. They can also serve as batch demarcation tuples which allows us to 
> separate the state for these batches in the intermediate as well as the 
> output operators.
> Discussion thread here: 
> https://lists.apache.org/thread.html/b4b27a59ecfe506260f694f66d520ba1e906ca5b2f360b07e173039b@%3Cdev.apex.apache.org%3E
> This would be worked on once apex-malhar depends on the release of apex-core 
> which has control tuple support.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (APEXCORE-575) Improve application relaunch time.

2017-03-14 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924130#comment-15924130
 ] 

Bhupesh Chawda commented on APEXCORE-575:
-

[~tushargosavi] Can you please update the description of the JIRA?

> Improve application relaunch time.
> --
>
> Key: APEXCORE-575
> URL: https://issues.apache.org/jira/browse/APEXCORE-575
> Project: Apache Apex Core
>  Issue Type: Improvement
>Reporter: Tushar Gosavi
>Assignee: Tushar Gosavi
>
> Improve application relaunch time.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (APEXMALHAR-2408) Issues in correctness of get() for key search in ManagedTimeStateImpl

2017-03-14 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda updated APEXMALHAR-2408:
---
Summary: Issues in correctness of get() for key search in 
ManagedTimeStateImpl  (was: Validate correctness of get() for key search in 
ManagedTimeStateImpl)

> Issues in correctness of get() for key search in ManagedTimeStateImpl
> -
>
> Key: APEXMALHAR-2408
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2408
> Project: Apache Apex Malhar
>  Issue Type: Sub-task
>Reporter: Chaitanya
>Assignee: Chaitanya
>
> Below are the issues:
> 1) When the key searches all time buckets, not checking whether the time 
> bucket is expired or not. 
> 2) CachedbucketMetas not cleared in DefaultBucket.freeMemory().
> 3) If the bucketed data is spill over multiple committed windows, firstKey of 
> BucketMetaData is not changing after the second committed call.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (APEXMALHAR-2406) ManagedState Issues

2017-03-14 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda updated APEXMALHAR-2406:
---
Description: Few bugs encountered while using ManagedState. Specific issues 
described in sub tasks.

> ManagedState Issues
> ---
>
> Key: APEXMALHAR-2406
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2406
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Chaitanya
>Assignee: Chaitanya
>
> Few bugs encountered while using ManagedState. Specific issues described in 
> sub tasks.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (APEXCORE-593) apex cli get-app-package-info could not retrieve properties defined in properties.xml

2017-03-14 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15923801#comment-15923801
 ] 

Bhupesh Chawda commented on APEXCORE-593:
-

[~vikram] Can you please add more details in the description as to what exactly 
the issue is and what are the steps to reproduce it?

> apex cli get-app-package-info could not retrieve properties defined in 
> properties.xml
> -
>
> Key: APEXCORE-593
> URL: https://issues.apache.org/jira/browse/APEXCORE-593
> Project: Apache Apex Core
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Yogi Devendra
>Priority: Minor
>
> If application defines properties in properties.xml; such properties should 
> be available to populateDAG() method when invoked from processAppDirectory() 
> in get-app-package-info.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (APEXMALHAR-2350) The key and value stream should match with the bucket

2017-03-10 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-2350.

   Resolution: Fixed
Fix Version/s: 3.7.0

> The key and value stream should match with the bucket
> -
>
> Key: APEXMALHAR-2350
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2350
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: bright chen
>Assignee: bright chen
> Fix For: 3.7.0
>
>
> In SpillableMapImpl, the bucket which the data put into will keep on changing 
> instead of a fixed bucket. So the key stream and value stream should match to 
> the bucket



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (APEXCORE-660) Documentation for Control tuple support changes

2017-03-04 Thread Bhupesh Chawda (JIRA)
Bhupesh Chawda created APEXCORE-660:
---

 Summary: Documentation for Control tuple support changes
 Key: APEXCORE-660
 URL: https://issues.apache.org/jira/browse/APEXCORE-660
 Project: Apache Apex Core
  Issue Type: Sub-task
Reporter: Bhupesh Chawda






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (APEXMALHAR-2366) Apply BloomFilter to Bucket

2017-02-23 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15882040#comment-15882040
 ] 

Bhupesh Chawda commented on APEXMALHAR-2366:


Hi [~brightchen]
Sorry for commenting so late. 
A bloom filter implementation is already implemented by [~chaithu] in the Megh 
library. You can see it here: 
https://github.com/DataTorrent/Megh/blob/master/library/src/main/java/com/datatorrent/lib/bucket/bloomFilter/BloomFilter.java

Can you please see if this implementation can be reused? I am asking this 
because the one in Megh is well tested as part of an earlier Deduper 
implementation.

> Apply BloomFilter to Bucket
> ---
>
> Key: APEXMALHAR-2366
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2366
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Reporter: bright chen
>Assignee: bright chen
>   Original Estimate: 192h
>  Remaining Estimate: 192h
>
> The bucket get() will check the cache and then check from the stored files if 
> the entry is not in the cache. The checking from files is a pretty heavy 
> operation due to file seek.
> The chance of check from file is very high if the key range are large.
> Suggest to apply BloomFilter for bucket to reduce the chance read from file.
> If the buckets were managed by ManagedStateImpl, the entry of bucket would be 
> very huge and the BloomFilter maybe not useful after a while. But If the 
> buckets were managed by ManagedTimeUnifiedStateImpl, each bucket keep certain 
> amount of entry and BloomFilter would be very useful.
> For implementation:
> The Guava already have BloomFilter and the interface are pretty simple and 
> fit for our case. But Guava 11 is not compatible with Guava 14 (Guava 11 use 
> Sink while Guava 14 use PrimitiveSink).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (APEXMALHAR-2374) Recursive support for AbstractFileInputOperator

2017-01-03 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-2374.

   Resolution: Fixed
Fix Version/s: 3.7.0

> Recursive support for AbstractFileInputOperator
> ---
>
> Key: APEXMALHAR-2374
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2374
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Affects Versions: 3.6.0
>Reporter: Francis Fernandes
>Assignee: Francis Fernandes
> Fix For: 3.7.0
>
>
> Add recursive support to AbstractFileInputOperator. Curently it fails with 
> FileNotFoundException when the input directory contains sub directories 
> inside it.
> {code}
> ERROR com.datatorrent.lib.io.fs.AbstractFileInputOperator: FS reader error
> java.io.FileNotFoundException: Path is not a file: /user/dttbc/DATASETS/a/b
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:70)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1932)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1853)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1825)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:559)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87)
>   at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>   at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
>   at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
>   at 
> org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1215)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1203)
>   at 
> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1193)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:299)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:265)
>   at org.apache.hadoop.hdfs.DFSInputStream.(DFSInputStream.java:257)
>   at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1492)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:302)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:298)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:298)
>   at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:766)
>   at 
> com.datatorrent.lib.io.fs.AbstractFileInputOperator.openFile(AbstractFileInputOperator.java:773)
>   at com.example.fileIO.BytesFileReader.openFile(BytesFileReader.java:87)
>   at 
> com.datatorrent.lib.io.fs.AbstractFileInputOperator.emitTuples(AbstractFileInputOperator.java:642)
>   at 
> com.example.fileIO.BytesFileReader.emitTuples(BytesFileReader.java:77)
>   at com.datatorrent.stram.engine.InputNode.run(InputNode.java:91)
> 

[jira] [Commented] (APEXMALHAR-2368) JDBCPollInput operator reads extra records when 1.5M records are added to a blank input table

2016-12-25 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1553#comment-1553
 ] 

Bhupesh Chawda commented on APEXMALHAR-2368:


[~Hitesh_] Can you please give some background on what the proposed PR is 
supposed to do and what the problem was?

> JDBCPollInput operator reads extra records when 1.5M records are added to a 
> blank input table
> -
>
> Key: APEXMALHAR-2368
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2368
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Hitesh Kapoor
>Assignee: Hitesh Kapoor
>
> JDBCPollInput operator reads extra records when 1.5 million new records are 
> added to the input table.
> Operator information:
> Operator location: malhar-library
> Available since: 3.5.0
> Operator state: Evolving
> Operator: 
> com.datatorrent.lib.db.jdbc.AbstractJdbcPollInputOperator
> com.datatorrent.lib.db.jdbc.JdbcPOJOPollInputOperator
> Observed only when >=1.5 million records are inserted into the table.
> Not observed with 1million records



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXMALHAR-2373) JdbcOutputOperator does not handle NUMERIC type

2016-12-21 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15766713#comment-15766713
 ] 

Bhupesh Chawda commented on APEXMALHAR-2373:


Same should also be handled in Jdbc Input Operators.

> JdbcOutputOperator does not handle NUMERIC type
> ---
>
> Key: APEXMALHAR-2373
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2373
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Bhupesh Chawda
>
> Database: Oracle
> Column type NUMERIC gives following exception in JdbcPOJOInsertOutputOperator:
> 2016-12-19 18:42:26,032 DEBUG jdbc.JdbcPOJOInsertOutputOperator 
> (JdbcPOJOInsertOutputOperator.java:populateColumnDataTypes(162)) - resultSet 
> MetaData column count 3
> 2016-12-19 18:42:26,033 DEBUG jdbc.JdbcPOJOInsertOutputOperator 
> (JdbcPOJOInsertOutputOperator.java:populateColumnDataTypes(170)) - column 
> name ACCOUNT_NO type 2
> 2016-12-19 18:42:26,033 DEBUG jdbc.JdbcPOJOInsertOutputOperator 
> (JdbcPOJOInsertOutputOperator.java:populateColumnDataTypes(170)) - column 
> name NAME type 12
> 2016-12-19 18:42:26,033 DEBUG jdbc.JdbcPOJOInsertOutputOperator 
> (JdbcPOJOInsertOutputOperator.java:populateColumnDataTypes(170)) - column 
> name AMOUNT type 2
> 2016-12-19 18:42:26,033 DEBUG engine.StreamingContainer 
> (StreamingContainer.java:setupNode(1333)) - activating 2 in container 
> container_e15_1482167280022_0017_01_02
> 2016-12-19 18:42:26,033 DEBUG jdbc.JdbcPOJOInsertOutputOperator 
> (JdbcPOJOInsertOutputOperator.java:activate(134)) - insert statement is 
> INSERT INTO test_output_event_table (ACCOUNT_NO,NAME,AMOUNT) VALUES (?,?,?)
> 2016-12-19 18:42:26,034 ERROR engine.StreamingContainer 
> (StreamingContainer.java:run(1431)) - Abandoning deployment of operator 
> OperatorDeployInfo[id=2,name=JdbcOutput,type=GENERIC,checkpoint={,
>  0, 
> 0},inputs=[OperatorDeployInfo.InputDeployInfo[portName=input,streamId=POJO's,sourceNodeId=1,sourcePortName=outputPort,locality=CONTAINER_LOCAL,partitionMask=0,partitionKeys=]],outputs=[]]
>  due to setup failure.
> java.lang.RuntimeException: unsupported data type 2
> at 
> com.datatorrent.lib.db.jdbc.AbstractJdbcPOJOOutputOperator.handleUnknownDataType(AbstractJdbcPOJOOutputOperator.java:177)
> at 
> com.datatorrent.lib.db.jdbc.AbstractJdbcPOJOOutputOperator.activate(AbstractJdbcPOJOOutputOperator.java:292)
> at 
> com.datatorrent.lib.db.jdbc.JdbcPOJOInsertOutputOperator.activate(JdbcPOJOInsertOutputOperator.java:136)
> at 
> com.datatorrent.lib.db.jdbc.JdbcPOJOInsertOutputOperator.activate(JdbcPOJOInsertOutputOperator.java:47)
> at com.datatorrent.stram.engine.Node.activate(Node.java:619)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (APEXMALHAR-2373) JdbcOutputOperator does not handle NUMERIC type

2016-12-21 Thread Bhupesh Chawda (JIRA)
Bhupesh Chawda created APEXMALHAR-2373:
--

 Summary: JdbcOutputOperator does not handle NUMERIC type
 Key: APEXMALHAR-2373
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2373
 Project: Apache Apex Malhar
  Issue Type: Bug
Reporter: Bhupesh Chawda


Database: Oracle
Column type NUMERIC gives following exception in JdbcPOJOInsertOutputOperator:

2016-12-19 18:42:26,032 DEBUG jdbc.JdbcPOJOInsertOutputOperator 
(JdbcPOJOInsertOutputOperator.java:populateColumnDataTypes(162)) - resultSet 
MetaData column count 3
2016-12-19 18:42:26,033 DEBUG jdbc.JdbcPOJOInsertOutputOperator 
(JdbcPOJOInsertOutputOperator.java:populateColumnDataTypes(170)) - column name 
ACCOUNT_NO type 2
2016-12-19 18:42:26,033 DEBUG jdbc.JdbcPOJOInsertOutputOperator 
(JdbcPOJOInsertOutputOperator.java:populateColumnDataTypes(170)) - column name 
NAME type 12
2016-12-19 18:42:26,033 DEBUG jdbc.JdbcPOJOInsertOutputOperator 
(JdbcPOJOInsertOutputOperator.java:populateColumnDataTypes(170)) - column name 
AMOUNT type 2
2016-12-19 18:42:26,033 DEBUG engine.StreamingContainer 
(StreamingContainer.java:setupNode(1333)) - activating 2 in container 
container_e15_1482167280022_0017_01_02
2016-12-19 18:42:26,033 DEBUG jdbc.JdbcPOJOInsertOutputOperator 
(JdbcPOJOInsertOutputOperator.java:activate(134)) - insert statement is INSERT 
INTO test_output_event_table (ACCOUNT_NO,NAME,AMOUNT) VALUES (?,?,?)
2016-12-19 18:42:26,034 ERROR engine.StreamingContainer 
(StreamingContainer.java:run(1431)) - Abandoning deployment of operator 
OperatorDeployInfo[id=2,name=JdbcOutput,type=GENERIC,checkpoint={,
 0, 
0},inputs=[OperatorDeployInfo.InputDeployInfo[portName=input,streamId=POJO's,sourceNodeId=1,sourcePortName=outputPort,locality=CONTAINER_LOCAL,partitionMask=0,partitionKeys=]],outputs=[]]
 due to setup failure.
java.lang.RuntimeException: unsupported data type 2
at 
com.datatorrent.lib.db.jdbc.AbstractJdbcPOJOOutputOperator.handleUnknownDataType(AbstractJdbcPOJOOutputOperator.java:177)
at 
com.datatorrent.lib.db.jdbc.AbstractJdbcPOJOOutputOperator.activate(AbstractJdbcPOJOOutputOperator.java:292)
at 
com.datatorrent.lib.db.jdbc.JdbcPOJOInsertOutputOperator.activate(JdbcPOJOInsertOutputOperator.java:136)
at 
com.datatorrent.lib.db.jdbc.JdbcPOJOInsertOutputOperator.activate(JdbcPOJOInsertOutputOperator.java:47)
at com.datatorrent.stram.engine.Node.activate(Node.java:619)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (APEXMALHAR-2371) Importing 'Apache Apex Malhar Iteration Demo' throws error for 'property' tag in properties.xml

2016-12-19 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-2371.

   Resolution: Fixed
Fix Version/s: 3.7.0

> Importing 'Apache Apex Malhar Iteration Demo' throws error for 'property' tag 
> in properties.xml
> ---
>
> Key: APEXMALHAR-2371
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2371
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Matt Zhang
>Assignee: Matt Zhang
> Fix For: 3.7.0
>
>
> Importing 'Apache Apex Malhar Iteration Demo' throws error for 'property' tag 
> in properties.xml.
> Error says:
> {code}
> [Fatal Error] properties.xml:44:3: The element type "property" must be 
> terminated by the matching end-tag "".
> 2016-06-29 03:05:17,349 WARN com.datatorrent.stram.client.AppPackage: 
> Ignoring META_INF/properties.xml because of error
> org.xml.sax.SAXParseException; systemId: 
> file:/home/dttbc/.dt/cache/dttbc/iteration-demo/3.4.0/apa/META-INF/properties.xml;
>  lineNumber: 44; columnNumber: 3; The element type "property" must be 
> terminated by the matching end-tag "".
>   at 
> com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
>   at 
> com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
>   at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:205)
>   at 
> com.datatorrent.stram.client.DTConfiguration.loadFile(DTConfiguration.java:164)
>   at 
> com.datatorrent.stram.client.DTConfiguration.loadFile(DTConfiguration.java:186)
>   at 
> com.datatorrent.stram.client.AppPackage.processPropertiesXml(AppPackage.java:430)
>   at com.datatorrent.stram.client.AppPackage.(AppPackage.java:172)
>   at 
> com.datatorrent.gateway.resources.ws.v2.AppPackagesLocalCache.syncPackage(dc:290)
>   at 
> com.datatorrent.gateway.resources.ws.v2.AppPackagesResource.importAppPackage(vc:1783)
>   at 
> com.datatorrent.gateway.resources.ws.v2.AppPackagesResource.importAppPackages(vc:529)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
>   at 
> com.sun.jersey.server.impl.model.method.dispatch.AbstractResourceMethodDispatchProvider$ResponseOutInvoker._dispatch(AbstractResourceMethodDispatchProvider.java:205)
>   at 
> com.sun.jersey.server.impl.model.method.dispatch.ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.java:75)
>   at 
> com.sun.jersey.server.impl.uri.rules.HttpMethodRule.accept(HttpMethodRule.java:288)
>   at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
>   at 
> com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:134)
>   at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
>   at 
> com.sun.jersey.server.impl.uri.rules.SubLocatorRule.accept(SubLocatorRule.java:134)
>   at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
>   at 
> com.sun.jersey.server.impl.uri.rules.ResourceClassRule.accept(ResourceClassRule.java:108)
>   at 
> com.sun.jersey.server.impl.uri.rules.RightHandPathRule.accept(RightHandPathRule.java:147)
>   at 
> com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule.accept(RootResourceClassesRule.java:84)
>   at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1469)
>   at 
> com.sun.jersey.server.impl.application.WebApplicationImpl._handleRequest(WebApplicationImpl.java:1400)
>   at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1349)
>   at 
> com.sun.jersey.server.impl.application.WebApplicationImpl.handleRequest(WebApplicationImpl.java:1339)
>   at 
> com.sun.jersey.spi.container.servlet.WebComponent.service(WebComponent.java:416)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:537)
>   at 
> com.sun.jersey.spi.container.servlet.ServletContainer.service(ServletContainer.java:699)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
>   at 
> org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:669)
>   at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:457)
>   

[jira] [Resolved] (APEXMALHAR-2316) Cannot register tuple class in XmlParser Operator

2016-12-02 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-2316.

   Resolution: Fixed
Fix Version/s: 3.7.0

> Cannot register tuple class in XmlParser Operator
> -
>
> Key: APEXMALHAR-2316
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2316
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Hitesh Kapoor
>Assignee: Hitesh Kapoor
> Fix For: 3.7.0
>
>
> Unable to set the tuple class attribute, getting the following stack trace 
> Abandoning deployment due to setup failure. java.lang.IllegalArgumentException
>   at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:635)
>   at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:584)
>   at com.datatorrent.lib.parser.XmlParser.setup(XmlParser.java:135)
>   at com.datatorrent.lib.parser.XmlParser.setup(XmlParser.java:63)
>   at com.datatorrent.stram.engine.Node.setup(Node.java:187)
>   at 
> com.datatorrent.stram.engine.StreamingContainer.setupNode(StreamingContainer.java:1309)
>   at 
> com.datatorrent.stram.engine.StreamingContainer.access$100(StreamingContainer.java:130)
>   at 
> com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1388).
> The problem is we are trying to use TUPLE_CLASS before it is properly 
> initialized.To fix this issue some part of code should be moved to activate() 
> from setup(). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (APEXMALHAR-2346) DocumentBuilder.parse() should take InputSource as an argument instead of String

2016-12-02 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-2346.

   Resolution: Fixed
Fix Version/s: 3.7.0

> DocumentBuilder.parse() should take InputSource as an argument instead of 
> String
> 
>
> Key: APEXMALHAR-2346
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2346
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Hitesh Kapoor
>Assignee: Hitesh Kapoor
> Fix For: 3.7.0
>
>
> The argument to DocumentBuilder.parse() needs to be fixed as that method does 
> not take an XML string. The fix is, easy and straightforward, to make an 
> InputSource out of the tuple.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (APEXMALHAR-2344) Initialize the list of FieldInfo in JDBCPollInput operator from properties.xml

2016-12-01 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-2344.

   Resolution: Fixed
Fix Version/s: 3.7.0

> Initialize the list of FieldInfo in JDBCPollInput operator from properties.xml
> --
>
> Key: APEXMALHAR-2344
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2344
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Reporter: Hitesh Kapoor
>Assignee: Hitesh Kapoor
> Fix For: 3.7.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXCORE-581) Delivery of Custom Control Tuples

2016-12-01 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15711794#comment-15711794
 ] 

Bhupesh Chawda commented on APEXCORE-581:
-

Do we need to store the control tuples in the buffer server.
I was thinking that the buffer server flow for all the tuples (including 
control tuples) would be as it is. We can intercept the tuples in GenericNode 
and buffer them there.
Once consolidated, we could put them into the sinks at the window boundaries.

> Delivery of Custom Control Tuples
> -
>
> Key: APEXCORE-581
> URL: https://issues.apache.org/jira/browse/APEXCORE-581
> Project: Apache Apex Core
>  Issue Type: Sub-task
>Reporter: David Yan
>
> The behavior should be as follow:
> - The control tuples should only be sent to downstream at streaming window 
> boundaries
> - The control tuples should be sent to all partitions downstream
> - The control tuples should be sent in the same order of arrival.
> - Within a streaming window, do not send the same control tuple twice, even 
> if the same control tuple is received multiple times within that window. This 
> is possible if the operator has two input ports. (The LinkedHashMap should be 
> easily able to ensure both order and uniqueness.)
> - The delivery of control tuples needs to stop at DelayOperator. 
> - When a streaming window is committed, remove the associated LinkedHashMap 
> that belong to windows with IDs that are less than the committed window
> - It's safe to assume the control tuples are rare enough and can fit in memory
> This will involve an additional MessageType to represent a custom control 
> tuple. 
> We probably need to have a data structure (possibly a LinkedHashMap) per 
> streaming window that stores the control tuple in the buffer server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (APEXMALHAR-2022) S3 Output Module for file copy

2016-11-30 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-2022.

   Resolution: Fixed
Fix Version/s: 3.7.0

> S3 Output Module for file copy
> --
>
> Key: APEXMALHAR-2022
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2022
> Project: Apache Apex Malhar
>  Issue Type: Task
>Reporter: Chaitanya
>Assignee: Chaitanya
> Fix For: 3.7.0
>
>
> Primary functionality of this module is copy files into S3 bucket using 
> block-by-block approach.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (APEXMALHAR-2357) JdbcPojoOperatorApplicationTest failing intermittently

2016-11-28 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-2357.

   Resolution: Fixed
Fix Version/s: 3.7.0

> JdbcPojoOperatorApplicationTest failing intermittently 
> ---
>
> Key: APEXMALHAR-2357
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2357
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Hitesh Kapoor
>Assignee: Hitesh Kapoor
> Fix For: 3.7.0
>
>
> JdbcPojoOperatorApplicationTest was added as a part of APEXMALHAR-2340. This 
> test is failing intermittently. The condition to exit the test, exits when 
> emitted tuples are equal to 10 and the check to pass the test is that these 
> tuples should be added to the table. The test was exiting before adding it in 
> table and hence there was a failure.
> The fix is straight forward and to change the exit criteria, it should exit 
> when the tuples are added to the table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (APEXMALHAR-2340) Initialize the list of JdbcFieldInfo in JdbcPOJOInsertOutput from properties.xml

2016-11-24 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-2340.

   Resolution: Fixed
Fix Version/s: 3.6.0

> Initialize the list of JdbcFieldInfo in JdbcPOJOInsertOutput from 
> properties.xml
> 
>
> Key: APEXMALHAR-2340
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2340
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Reporter: Hitesh Kapoor
>Assignee: Hitesh Kapoor
> Fix For: 3.6.0
>
>
> Currently the list of JdbcFieldInfo is populated using java code.
> This should be done using properties.xml file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (APEXMALHAR-2353) timeExpression should not be null for time based Dedup

2016-11-23 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda updated APEXMALHAR-2353:
---
Description: 
Time Based Dedup has timeExpression as optional. It has to be supplied by the 
user.

In the current setting, if the user does not specify a timeExpression, then a 
different time (System time) will be passed for each tuple, irrespective of 
whether the tuple is a duplicate or a unique. This is a bug since even the 
duplicate tuples may fall in different buckets and will be concluded as unique. 

  was:Time Based Dedup has timeExpression as optional. It has to be supplied by 
the user.


> timeExpression should not be null for time based Dedup  
> 
>
> Key: APEXMALHAR-2353
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2353
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Bhupesh Chawda
>Assignee: Bhupesh Chawda
>Priority: Minor
>
> Time Based Dedup has timeExpression as optional. It has to be supplied by the 
> user.
> In the current setting, if the user does not specify a timeExpression, then a 
> different time (System time) will be passed for each tuple, irrespective of 
> whether the tuple is a duplicate or a unique. This is a bug since even the 
> duplicate tuples may fall in different buckets and will be concluded as 
> unique. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (APEXMALHAR-2353) timeExpression should not be null for time based Dedup

2016-11-23 Thread Bhupesh Chawda (JIRA)
Bhupesh Chawda created APEXMALHAR-2353:
--

 Summary: timeExpression should not be null for time based Dedup  
 Key: APEXMALHAR-2353
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2353
 Project: Apache Apex Malhar
  Issue Type: Bug
Reporter: Bhupesh Chawda
Assignee: Bhupesh Chawda
Priority: Minor


Time Based Dedup has timeExpression as optional. It has to be supplied by the 
user.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXMALHAR-2284) POJOInnerJoinOperatorTest fails in Travis CI

2016-11-11 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15657669#comment-15657669
 ] 

Bhupesh Chawda commented on APEXMALHAR-2284:


[~chaithu] If you have addressed the correctness aspect, then our priority 
should be to unblock the release.
You can create new JIRAs for the improvement in the class hierarchy for Managed 
state and Spillable structures and keep working on it. 
Meanwhile let us get the current PR to a mergeable state.

[~csingh] [~thw] What do you think?

> POJOInnerJoinOperatorTest fails in Travis CI
> 
>
> Key: APEXMALHAR-2284
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2284
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Thomas Weise
>Assignee: Chaitanya
>Priority: Blocker
> Fix For: 3.6.0
>
>
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/166322754/log.txt
> {code}
> Failed tests: 
>   POJOInnerJoinOperatorTest.testEmitMultipleTuplesFromStream2:337 Number of 
> tuple emitted  expected:<2> but was:<4>
>   POJOInnerJoinOperatorTest.testInnerJoinOperator:184 Number of tuple emitted 
>  expected:<1> but was:<2>
>   POJOInnerJoinOperatorTest.testMultipleValues:236 Number of tuple emitted  
> expected:<2> but was:<3>
>   POJOInnerJoinOperatorTest.testUpdateStream1Values:292 Number of tuple 
> emitted  expected:<1> but was:<2>
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXMALHAR-2284) POJOInnerJoinOperatorTest fails in Travis CI

2016-10-26 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15610758#comment-15610758
 ] 

Bhupesh Chawda commented on APEXMALHAR-2284:


[~chaithu] I think we should go ahead with the approach you are suggesting.
Since this is a blocker for the next malhar release, we should get it done asap.

> POJOInnerJoinOperatorTest fails in Travis CI
> 
>
> Key: APEXMALHAR-2284
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2284
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Thomas Weise
>Assignee: Chaitanya
>Priority: Blocker
> Fix For: 3.6.0
>
>
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/166322754/log.txt
> {code}
> Failed tests: 
>   POJOInnerJoinOperatorTest.testEmitMultipleTuplesFromStream2:337 Number of 
> tuple emitted  expected:<2> but was:<4>
>   POJOInnerJoinOperatorTest.testInnerJoinOperator:184 Number of tuple emitted 
>  expected:<1> but was:<2>
>   POJOInnerJoinOperatorTest.testMultipleValues:236 Number of tuple emitted  
> expected:<2> but was:<3>
>   POJOInnerJoinOperatorTest.testUpdateStream1Values:292 Number of tuple 
> emitted  expected:<1> but was:<2>
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (APEXMALHAR-2291) Exactly-once processing not working correctly for JdbcPOJOInsertOutputOperator

2016-10-26 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-2291.

   Resolution: Fixed
Fix Version/s: 3.6.0

> Exactly-once processing not working correctly for JdbcPOJOInsertOutputOperator
> --
>
> Key: APEXMALHAR-2291
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2291
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Hitesh Kapoor
>Assignee: Hitesh Kapoor
> Fix For: 3.6.0
>
>
> It is observed the exactly-once processing is not working correctly. When the 
> operator crashes, it comes back and inserts data again for non-checkpointed 
> windows



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (APEXMALHAR-2309) TimeBasedDedupOperator marks new tuples as duplicates if expired tuples exist

2016-10-26 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-2309.

   Resolution: Fixed
Fix Version/s: 3.6.0

> TimeBasedDedupOperator marks new tuples as duplicates if expired tuples exist
> -
>
> Key: APEXMALHAR-2309
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2309
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Francis Fernandes
>Assignee: Francis Fernandes
> Fix For: 3.6.0
>
>
> The deduper marks valid tuples outside the expiry window as duplicates. 
> Consider the following configuration (number of buckets = 1 )
> {code}
>   
> 
> dt.application.DedupTestApp.operator.Deduper.prop.expireBefore
> 10
>   
>   
> dt.application.DedupTestApp.operator.Deduper.prop.bucketSpan
> 10
>   
> {code}
> The data piped in is : 
> {code}
> "10",1474614305000,"Test"
> "11",1474614315000,"Test"
> "10",1474614325000,"Test"
> {code}
> The 3rd tuple is valid since it is outside of the expiry window. But it is 
> marked as duplicate because although the first tuple although expired is 
> still present in the Bucket.flash.
> The issue happens when the expiry duration lesser than the checkpointing 
> duration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (APEXMALHAR-2315) Ignore Join Test for because of issues in POJOInnerJoinOperator

2016-10-26 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-2315.

   Resolution: Fixed
Fix Version/s: 3.6.0

> Ignore Join Test for because of issues in POJOInnerJoinOperator
> ---
>
> Key: APEXMALHAR-2315
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2315
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Chinmay Kolhatkar
>Assignee: Chinmay Kolhatkar
>Priority: Minor
> Fix For: 3.6.0
>
>
> Ignore Join Test for because of issues in POJOInnerJoinOperator
> sql integration uses POJOInnerJoinOperator because of which issues occuring 
> because of managed state and inner join operator interaction is causing this 
> test to fail.
> There is a Jira (APEXMALHAR-2295) present to replace POJOInnerJoinOperator 
> with WIndowed variant of it. As a part of that, these tests should be resumed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXMALHAR-2284) POJOInnerJoinOperatorTest fails in Travis CI

2016-10-19 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15588218#comment-15588218
 ] 

Bhupesh Chawda commented on APEXMALHAR-2284:


If ManagedTimeStateMultiValue and SpillableArrayListMultiMap are different 
functionalities, perhaps adding time support to SpillableArrayListMultiMap 
might change the functionality it is currently providing. 
As far as I can see, each uses a different state - BucketedState (for 
SpillableArrayListMultiMap) vs TimeSlicedBucketedState (for 
ManagedTimeStateMultiValue) and should not be mixed.

[~csingh], [~thw], thoughts?


> POJOInnerJoinOperatorTest fails in Travis CI
> 
>
> Key: APEXMALHAR-2284
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2284
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Thomas Weise
>Assignee: Chaitanya
>Priority: Blocker
> Fix For: 3.6.0
>
>
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/166322754/log.txt
> {code}
> Failed tests: 
>   POJOInnerJoinOperatorTest.testEmitMultipleTuplesFromStream2:337 Number of 
> tuple emitted  expected:<2> but was:<4>
>   POJOInnerJoinOperatorTest.testInnerJoinOperator:184 Number of tuple emitted 
>  expected:<1> but was:<2>
>   POJOInnerJoinOperatorTest.testMultipleValues:236 Number of tuple emitted  
> expected:<2> but was:<3>
>   POJOInnerJoinOperatorTest.testUpdateStream1Values:292 Number of tuple 
> emitted  expected:<1> but was:<2>
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXMALHAR-2284) POJOInnerJoinOperatorTest fails in Travis CI

2016-10-18 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15585604#comment-15585604
 ] 

Bhupesh Chawda commented on APEXMALHAR-2284:


As far as I understand, the issue is with race conditions involved in 
simultaneous get() and put() on a store based on managed state.
For the join use case, asynchronous get() is important for efficiency reasons.

Upon some discussion with Chaitanya, here are the possible options:
1. Make all get() and put() calls blocking. That way the issue is avoided at 
the cost of decreased throughput.
2. Allow an option to specify the source (Bucket.ReadSource).
3  All data inserted by the put() call is stored in a separate space and 
synched with managed state at end window. This ensures that put() does not 
interfere with asynchronous get() calls. This is similar to the way 
SpillableArrayListMultimap works, but which cannot be used directly as it lacks 
some functionality like asynchronous get.
4. Override the getAsync() method to achieve the desired functionality.


> POJOInnerJoinOperatorTest fails in Travis CI
> 
>
> Key: APEXMALHAR-2284
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2284
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Thomas Weise
>Assignee: Chaitanya
>Priority: Blocker
> Fix For: 3.6.0
>
>
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/166322754/log.txt
> {code}
> Failed tests: 
>   POJOInnerJoinOperatorTest.testEmitMultipleTuplesFromStream2:337 Number of 
> tuple emitted  expected:<2> but was:<4>
>   POJOInnerJoinOperatorTest.testInnerJoinOperator:184 Number of tuple emitted 
>  expected:<1> but was:<2>
>   POJOInnerJoinOperatorTest.testMultipleValues:236 Number of tuple emitted  
> expected:<2> but was:<3>
>   POJOInnerJoinOperatorTest.testUpdateStream1Values:292 Number of tuple 
> emitted  expected:<1> but was:<2>
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (APEXCORE-560) Logical plan is not changed when all physical partitions of operator are removed from DAG

2016-10-14 Thread Bhupesh Chawda (JIRA)
Bhupesh Chawda created APEXCORE-560:
---

 Summary: Logical plan is not changed when all physical partitions 
of operator are removed from DAG
 Key: APEXCORE-560
 URL: https://issues.apache.org/jira/browse/APEXCORE-560
 Project: Apache Apex Core
  Issue Type: Bug
Reporter: Bhupesh Chawda


Throwing a ```ShutdownException()``` from an input operator removes them from 
the physical plan, but can still be seen in the logical plan. Ideally the 
corresponding logical operator must also be removed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (APEXMALHAR-2258) JavaExpressionParser does not cast type correctly when expression is binary

2016-09-21 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-2258.

   Resolution: Fixed
Fix Version/s: 3.6.0

Merged

> JavaExpressionParser does not cast type correctly when expression is binary
> ---
>
> Key: APEXMALHAR-2258
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2258
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Chinmay Kolhatkar
>Assignee: Chinmay Kolhatkar
> Fix For: 3.6.0
>
>
> JavaExpressionParser does not cast type correctly when expression is binary.
> It should cast total of resolved expression to return type instead of just 
> first part of expression.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (APEXMALHAR-2256) POJOInnerJoinOperator should use getDeclaredField of java reflection

2016-09-21 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-2256.

Resolution: Fixed

Merged

> POJOInnerJoinOperator should use getDeclaredField of java reflection
> 
>
> Key: APEXMALHAR-2256
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2256
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Chinmay Kolhatkar
>Assignee: Chinmay Kolhatkar
> Fix For: 3.6.0
>
>
> POJOInnerJoinOperator should use getDeclaredField of java reflection. 
> Currently its using getField because of which unless the POJO fields are 
> public, they're not see by reflection. In reality the fields in POJO will be 
> private and getters/setters will be public.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (APEXMALHAR-2256) POJOInnerJoinOperator should use getDeclaredField of java reflection

2016-09-21 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda updated APEXMALHAR-2256:
---
Fix Version/s: 3.6.0

> POJOInnerJoinOperator should use getDeclaredField of java reflection
> 
>
> Key: APEXMALHAR-2256
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2256
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Affects Versions: 3.5.0
>Reporter: Chinmay Kolhatkar
>Assignee: Chinmay Kolhatkar
> Fix For: 3.6.0
>
>
> POJOInnerJoinOperator should use getDeclaredField of java reflection. 
> Currently its using getField because of which unless the POJO fields are 
> public, they're not see by reflection. In reality the fields in POJO will be 
> private and getters/setters will be public.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (APEXMALHAR-2206) Some Application tests taking too long

2016-08-26 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438856#comment-15438856
 ] 

Bhupesh Chawda edited comment on APEXMALHAR-2206 at 8/26/16 12:59 PM:
--

Also one of the app tests in Deduper did not suhtdown the application. Added 
this fix to the PR.


was (Author: bhupesh):
Also one of the app tests did not suhtdown the application.

> Some Application tests taking too long
> --
>
> Key: APEXMALHAR-2206
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2206
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Bhupesh Chawda
>Assignee: Bhupesh Chawda
>Priority: Minor
> Fix For: 3.5.0
>
>
> Some Application Tests seems to be running for a long time.
> Additionally the size of the target folder increases too much ~ 2GB



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (APEXMALHAR-2206) Some Application tests taking too long

2016-08-26 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda updated APEXMALHAR-2206:
---
Description: 
Some Application Tests seems to be running for a long time.
Additionally the size of the target folder increases too much ~ 2GB

  was:Deduper Application Test seem to be taking too long intermittently.


> Some Application tests taking too long
> --
>
> Key: APEXMALHAR-2206
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2206
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Bhupesh Chawda
>Assignee: Bhupesh Chawda
>Priority: Minor
> Fix For: 3.5.0
>
>
> Some Application Tests seems to be running for a long time.
> Additionally the size of the target folder increases too much ~ 2GB



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (APEXMALHAR-2206) Some Application tests taking too long

2016-08-26 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda updated APEXMALHAR-2206:
---
Summary: Some Application tests taking too long  (was: Deduper Application 
test taking too long)

> Some Application tests taking too long
> --
>
> Key: APEXMALHAR-2206
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2206
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Bhupesh Chawda
>Assignee: Bhupesh Chawda
>Priority: Minor
> Fix For: 3.5.0
>
>
> Deduper Application Test seem to be taking too long intermittently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (APEXMALHAR-2206) Deduper Application test taking too long

2016-08-26 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda updated APEXMALHAR-2206:
---
Fix Version/s: 3.5.0

> Deduper Application test taking too long
> 
>
> Key: APEXMALHAR-2206
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2206
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Bhupesh Chawda
>Assignee: Bhupesh Chawda
>Priority: Minor
> Fix For: 3.5.0
>
>
> Deduper Application Test seem to be taking too long intermittently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXMALHAR-2206) Deduper Application test taking too long

2016-08-26 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15438803#comment-15438803
 ] 

Bhupesh Chawda commented on APEXMALHAR-2206:


The input data generator operator does not limit the amount of data per window.
It might be the case that, in local mode, the thread which is supposed to call 
processstats() gets delayed and the test takes a long time.

> Deduper Application test taking too long
> 
>
> Key: APEXMALHAR-2206
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2206
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Bhupesh Chawda
>Assignee: Bhupesh Chawda
>Priority: Minor
>
> Deduper Application Test seem to be taking too long intermittently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (APEXMALHAR-2206) Deduper Application test taking too long

2016-08-26 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda reassigned APEXMALHAR-2206:
--

Assignee: Bhupesh Chawda

> Deduper Application test taking too long
> 
>
> Key: APEXMALHAR-2206
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2206
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Bhupesh Chawda
>Assignee: Bhupesh Chawda
>Priority: Minor
>
> Deduper Application Test seem to be taking too long intermittently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (APEXMALHAR-2206) Deduper Application test taking too long

2016-08-26 Thread Bhupesh Chawda (JIRA)
Bhupesh Chawda created APEXMALHAR-2206:
--

 Summary: Deduper Application test taking too long
 Key: APEXMALHAR-2206
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2206
 Project: Apache Apex Malhar
  Issue Type: Bug
Reporter: Bhupesh Chawda
Priority: Minor


Deduper Application Test seem to be taking too long intermittently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (APEXMALHAR-2129) ManagedState: Add a disable purging option

2016-08-12 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-2129.

   Resolution: Done
Fix Version/s: 3.5.0

Merged

> ManagedState: Add a disable purging option
> --
>
> Key: APEXMALHAR-2129
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2129
> Project: Apache Apex Malhar
>  Issue Type: Task
>Reporter: Bhupesh Chawda
>Assignee: Chandni Singh
> Fix For: 3.5.0
>
>
> Have an option that can disable purging of data.
> Currently the TimeBucketAssigner moves time boundary periodically based on 
> system time as well as based on event time. This is error prone and moving 
> time boundary periodically implies that data will always be purged. 
> The change that needs to be made to TimeBucketAssigner is to just move time 
> boundary when there are future events that don't fall in that boundary. This 
> will imply that if there are no events outside the current boundary then data 
> will not be purged.
> ManagedStateImpl uses processing time for an event, so moving boundaries 
> based on system time is going to happen there nevertheless.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (APEXMALHAR-2172) Update JDBC poll input operator to fix issues

2016-08-11 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-2172.

Resolution: Fixed

Merged

> Update JDBC poll input operator to fix issues
> -
>
> Key: APEXMALHAR-2172
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2172
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Reporter: Priyanka Gugale
>Assignee: Priyanka Gugale
> Fix For: 3.5.0
>
>
> Update JDBCPollInputOperator to:
> 1. Fix small bugs
> 2. Use jooq query dsl library to construct sql queries
> 3. Make code more readable
> 4. Use row counts rather than key column values to partition reads



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (APEXMALHAR-2172) Update JDBC poll input operator to fix issues

2016-08-11 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda updated APEXMALHAR-2172:
---
Fix Version/s: 3.5.0

> Update JDBC poll input operator to fix issues
> -
>
> Key: APEXMALHAR-2172
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2172
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Reporter: Priyanka Gugale
>Assignee: Priyanka Gugale
> Fix For: 3.5.0
>
>
> Update JDBCPollInputOperator to:
> 1. Fix small bugs
> 2. Use jooq query dsl library to construct sql queries
> 3. Make code more readable
> 4. Use row counts rather than key column values to partition reads



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (APEXMALHAR-2185) Add a Deduper implementation for Bounded data

2016-08-11 Thread Bhupesh Chawda (JIRA)
Bhupesh Chawda created APEXMALHAR-2185:
--

 Summary: Add a Deduper implementation for Bounded data
 Key: APEXMALHAR-2185
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2185
 Project: Apache Apex Malhar
  Issue Type: Task
Reporter: Bhupesh Chawda
Assignee: Bhupesh Chawda
Priority: Minor


We have the Abstract and the Time based implementation in place.
Need to add an implementation which can handle bounded data case. Such cases, 
may not have a time field and would require to dedup across the entire incoming 
data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (APEXMALHAR-2177) Deduper: Create a windowed dedup operator

2016-08-07 Thread Bhupesh Chawda (JIRA)
Bhupesh Chawda created APEXMALHAR-2177:
--

 Summary: Deduper: Create a windowed dedup operator
 Key: APEXMALHAR-2177
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2177
 Project: Apache Apex Malhar
  Issue Type: Task
Reporter: Bhupesh Chawda
Priority: Minor


Create a Deduper based on Windowed Operator. After exploring a little bit, it 
seems it will be a good idea to have a windowed deduper once we have the 
storage mechanisms in Windowed operator in place.

Here is the PR which has the basic implementation: 
https://github.com/apache/apex-malhar/pull/343




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (APEXMALHAR-2153) Add user documentation for Enricher on apex docs

2016-08-02 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-2153.

   Resolution: Done
Fix Version/s: 3.5.0

Merged

> Add user documentation for Enricher on apex docs
> 
>
> Key: APEXMALHAR-2153
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2153
> Project: Apache Apex Malhar
>  Issue Type: Documentation
>Reporter: Chinmay Kolhatkar
>Assignee: Chinmay Kolhatkar
> Fix For: 3.5.0
>
>
> Add user documentation for Enricher on apex docs



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXMALHAR-2129) ManagedState: Add a disable purging option

2016-08-01 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15401886#comment-15401886
 ] 

Bhupesh Chawda commented on APEXMALHAR-2129:


As discussed on dev@apex, we are proceeding with implementing deduper with 
managed state. In order to handle case of bounded data without any time field, 
we will need to have the following functionality in the TimeBucketAssigner:
1. Disable purging
2. Disable expiry of tuples - We will need to store and dedup across the entire 
data set.

> ManagedState: Add a disable purging option
> --
>
> Key: APEXMALHAR-2129
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2129
> Project: Apache Apex Malhar
>  Issue Type: Task
>Reporter: Bhupesh Chawda
>Assignee: Chandni Singh
>
> Have an option that can disable purging of data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (APEXMALHAR-2066) Add jdbc poller input operator

2016-07-15 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-2066.

   Resolution: Done
Fix Version/s: 3.5.0

> Add jdbc poller input operator
> --
>
> Key: APEXMALHAR-2066
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2066
> Project: Apache Apex Malhar
>  Issue Type: Task
>Reporter: Ashwin Chandra Putta
>Assignee: devendra tagare
> Fix For: 3.5.0
>
>
> Create a JDBC poller input operator that has the following features.
> 1. poll from external jdbc store asynchronously in the input operator.
> 2. polling frequency and batch size should be configurable.
> 3. should be idempotent.
> 4. should be partition-able.
> 5. should be batch + polling capable.
> Assumptions for idempotency & partitioning,
> 1.User needs to provide tableName,dbConnection,setEmitColumnList,look-up key.
> 2.Optionally batchSize,pollInterval,Look-up key and a where clause can be 
> given.
> 3.This operator uses static partitioning to arrive at range queries for 
> exactly once reads.
> This operator will create a configured number of non-polling static 
> partitions for fetching the existing data in the table. And an additional
> single partition for polling additive data.
> 4.Assumption is that there is an ordered column using which range queries can 
> be formed.
> The *key* column, based on which the polling will happen, is any column which 
> has ever increasing values and supports greater than and less
> than operations in SQL. 
> 5.If an emitColumnList is provided, please ensure that the keyColumn is the 
> first column in the list
> 6.Range queries are formed using the JdbcMetaDataUtility Output - comma 
> separated list of the emit columns eg columnA,columnB,columnC
> 7. Only newly added data which has increasing ids will be fetched by the
>polling jdbc partition
> Per window the first and the last key processed is saved using the 
> FSWindowDataManager - (,operatorId,windowId).This 
> (lowerBound,upperBoundPair) is then used for recovery.The queries are 
> constructed using the JDBCMetaDataUtility.
> JDBCMetaDataUtility
> A utility class used to retrieve the metadata for a given unique key of a SQL 
> table. This class would emit range queries based on a primary index given.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (APEXMALHAR-1966) Cassandra output operator improvements

2016-07-01 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda updated APEXMALHAR-1966:
---
Fix Version/s: 3.5.0

> Cassandra output operator improvements
> --
>
> Key: APEXMALHAR-1966
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-1966
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Reporter: Priyanka Gugale
>Assignee: Priyanka Gugale
> Fix For: 3.5.0
>
>
> Update existing Cassandra output operator to:
> 1. Accept use defined parameterized queries, the queries could be for update, 
> insert or delete.
> 2. Add error port to emit tuples which couldn't be written to database.
> 3. Add metrics
> 4. Provide a way to restrict batch size



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (APEXMALHAR-1966) Cassandra output operator improvements

2016-07-01 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-1966.

Resolution: Done

Merged

> Cassandra output operator improvements
> --
>
> Key: APEXMALHAR-1966
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-1966
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Reporter: Priyanka Gugale
>Assignee: Priyanka Gugale
> Fix For: 3.5.0
>
>
> Update existing Cassandra output operator to:
> 1. Accept use defined parameterized queries, the queries could be for update, 
> insert or delete.
> 2. Add error port to emit tuples which couldn't be written to database.
> 3. Add metrics
> 4. Provide a way to restrict batch size



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXMALHAR-2129) ManagedState: Add a disable purging option

2016-06-30 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356775#comment-15356775
 ] 

Bhupesh Chawda commented on APEXMALHAR-2129:


Okay, sure.
Just another point: Disabling purging may retain data in managed state. We must 
also make sure the data never expires. Something like setting a very large 
expiry perhaps?

> ManagedState: Add a disable purging option
> --
>
> Key: APEXMALHAR-2129
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2129
> Project: Apache Apex Malhar
>  Issue Type: Task
>Reporter: Bhupesh Chawda
>Assignee: Chandni Singh
>
> Have an option that can disable purging of data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (APEXMALHAR-2129) Introduce option to advance time through Expiry task in TimeBucketAssigner

2016-06-29 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda closed APEXMALHAR-2129.
--
Resolution: Invalid

> Introduce option to advance time through Expiry task in TimeBucketAssigner
> --
>
> Key: APEXMALHAR-2129
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2129
> Project: Apache Apex Malhar
>  Issue Type: Task
>Reporter: Bhupesh Chawda
>Assignee: Bhupesh Chawda
>Priority: Minor
>
> TimeBucketAssigner advances the time boundaries of the buckets viz. start and 
> end to the current system time every window.
> The requirement is to add an option so that clients can disable this if 
> needed. Tuple time based deduplication has such a requirement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXMALHAR-2129) Introduce option to advance time through Expiry task in TimeBucketAssigner

2016-06-29 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15354682#comment-15354682
 ] 

Bhupesh Chawda commented on APEXMALHAR-2129:


The current implementation advances the "start" and "end" automatically every 
window by the length of a bucket span.
In case of deduplication for a data set which is not based on the current 
system time (say an year old data), this is not needed. We just need to advance 
"start" and "end" based on the tuple times, not automatically every window.

The idea is to decouple advancement of windows and purging. Purging will still 
happen, just not with the default "lowestTimeBucket" which is incremented by 1 
every window. 

> Introduce option to advance time through Expiry task in TimeBucketAssigner
> --
>
> Key: APEXMALHAR-2129
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2129
> Project: Apache Apex Malhar
>  Issue Type: Task
>Reporter: Bhupesh Chawda
>Assignee: Bhupesh Chawda
>Priority: Minor
>
> TimeBucketAssigner advances the time boundaries of the buckets viz. start and 
> end to the current system time every window.
> The requirement is to add an option so that clients can disable this if 
> needed. Tuple time based deduplication has such a requirement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXMALHAR-2129) Introduce option to advance time through Expiry task in TimeBucketAssigner

2016-06-29 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15354683#comment-15354683
 ] 

Bhupesh Chawda commented on APEXMALHAR-2129:


The ask is just to add an option to do it based on need.

> Introduce option to advance time through Expiry task in TimeBucketAssigner
> --
>
> Key: APEXMALHAR-2129
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2129
> Project: Apache Apex Malhar
>  Issue Type: Task
>Reporter: Bhupesh Chawda
>Assignee: Bhupesh Chawda
>Priority: Minor
>
> TimeBucketAssigner advances the time boundaries of the buckets viz. start and 
> end to the current system time every window.
> The requirement is to add an option so that clients can disable this if 
> needed. Tuple time based deduplication has such a requirement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (APEXMALHAR-2113) Dag fails validation due to @NotNull on getUpdateCommand() in JdbcPOJOOutputOperator

2016-06-12 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda resolved APEXMALHAR-2113.

Resolution: Fixed

> Dag fails validation due to @NotNull on getUpdateCommand() in 
> JdbcPOJOOutputOperator
> 
>
> Key: APEXMALHAR-2113
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2113
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: devendra tagare
>Assignee: devendra tagare
> Fix For: 3.5.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> getUpdateCommand(); is marked as @NotNull in the 
> AbstractJdbcTransactionableOutputOperator which is used by 
> JdbcPOJOOutputOperator.
> This method is referenced during the validation phase of DAG and 
> updateCommand is initialized only at setup.This is causing the DAG 
> initialization to fail on constraints violation.
> Stack trace below,
> An error occurred trying to launch the application. Server message: 
> javax.validation.ConstraintViolationException: Operator JdbcOutput violates 
> constraints 
> [ConstraintViolationImpl{rootBean=JdbcPOJOOutputOperator{name=null}, 
> propertyPath='updateCommand', message='may not be null', 
> leafBean=JdbcPOJOOutputOperator{name=null}, value=null}] at 
> com.datatorrent.stram.plan.logical.LogicalPlan.validate(LogicalPlan.java:1680)
>  at com.datatorrent.stram.StramClient.(StramClient.java:161) at 
> com.datatorrent.stram.client.StramAppLauncher.launchApp(StramAppLauncher.java:509)
>  at com.datatorrent.stram.cli.DTCli$LaunchCommand.execute(DTCli.java:2050) at 
> com.datatorrent.stram.cli.DTCli.launchAppPackage(DTCli.java:3456) at 
> com.datatorrent.stram.cli.DTCli.access$7100(DTCli.java:106) at 
> com.datatorrent.stram.cli.DTCli$LaunchCommand.execute(DTCli.java:1895) at 
> com.datatorrent.stram.cli.DTCli$3.run(DTCli.java:1449)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (APEXMALHAR-2113) Dag fails validation due to @NotNull on getUpdateCommand() in JdbcPOJOOutputOperator

2016-06-10 Thread Bhupesh Chawda (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupesh Chawda updated APEXMALHAR-2113:
---
Summary: Dag fails validation due to @NotNull on getUpdateCommand() in 
JdbcPOJOOutputOperator  (was: JdbcPOJOOutputOperator)

> Dag fails validation due to @NotNull on getUpdateCommand() in 
> JdbcPOJOOutputOperator
> 
>
> Key: APEXMALHAR-2113
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2113
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: devendra tagare
>Assignee: devendra tagare
> Fix For: 3.5.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> getUpdateCommand(); is marked as @NotNull in the 
> AbstractJdbcTransactionableOutputOperator which is used by 
> JdbcPOJOOutputOperator.
> This method is referenced during the validation phase of DAG and 
> updateCommand is initialized only at setup.This is causing the DAG 
> initialization to fail on constraints violation.
> Stack trace below,
> An error occurred trying to launch the application. Server message: 
> javax.validation.ConstraintViolationException: Operator JdbcOutput violates 
> constraints 
> [ConstraintViolationImpl{rootBean=JdbcPOJOOutputOperator{name=null}, 
> propertyPath='updateCommand', message='may not be null', 
> leafBean=JdbcPOJOOutputOperator{name=null}, value=null}] at 
> com.datatorrent.stram.plan.logical.LogicalPlan.validate(LogicalPlan.java:1680)
>  at com.datatorrent.stram.StramClient.(StramClient.java:161) at 
> com.datatorrent.stram.client.StramAppLauncher.launchApp(StramAppLauncher.java:509)
>  at com.datatorrent.stram.cli.DTCli$LaunchCommand.execute(DTCli.java:2050) at 
> com.datatorrent.stram.cli.DTCli.launchAppPackage(DTCli.java:3456) at 
> com.datatorrent.stram.cli.DTCli.access$7100(DTCli.java:106) at 
> com.datatorrent.stram.cli.DTCli$LaunchCommand.execute(DTCli.java:1895) at 
> com.datatorrent.stram.cli.DTCli$3.run(DTCli.java:1449)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXCORE-202) Integration with Samoa

2016-06-08 Thread Bhupesh Chawda (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15321663#comment-15321663
 ] 

Bhupesh Chawda commented on APEXCORE-202:
-

Initial PR is open: https://github.com/apache/incubator-samoa/pull/55

> Integration with Samoa
> --
>
> Key: APEXCORE-202
> URL: https://issues.apache.org/jira/browse/APEXCORE-202
> Project: Apache Apex Core
>  Issue Type: New Feature
>Reporter: Siyuan Hua
>Assignee: Bhupesh Chawda
>  Labels: roadmap
>
> Apache Samoa[https://samoa.incubator.apache.org/] is an abstraction of a 
> collections of streaming machine learning Algorithm. By far, it has 
> integration with Samza, Storm and flink, It is a good start point for Apex to 
> support streaming ML.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)