Re: PrintWriter for LogicalPlan

2016-09-22 Thread Chinmay Kolhatkar
Thanks tushar. I got it working in similar ways:

1. For DAG as JSON String:

Map stringObjectMap =
LogicalPlanSerializer.convertToMap(dag, true);
ObjectMapper mapper = new ObjectMapper();
mapper.enable(SerializationFeature.INDENT_OUTPUT);
String s = mapper.writeValueAsString(stringObjectMap);
System.out.println(s);

2. For DAG as properties file:
StringWriter stringWriter = new StringWriter();
LogicalPlanSerializer.convertToProperties(dag).save(stringWriter);
System.out.println(stringWriter.toString());


Either works for me. Thanks for all the help.

-Chinmay.




On Thu, Sep 22, 2016 at 12:22 PM, Tushar Gosavi 
wrote:

> Hi Chinmay,
>
> take a look at following gist for example.
> https://gist.github.com/anonymous/8df8e05ffc16e620bbc030e1034da3c4
>
> - Tushar.
>
>
> On Thu, Sep 22, 2016 at 12:05 PM, Chinmay Kolhatkar
>  wrote:
> > Hi Tushar,
> >
> > Thanks for the information. Can you please provide me pointers on how to
> > use it?
> >
> > I can always examine the LogicalPlan structure but for a test case, I
> think
> > its an overkill, instead if I have a utility which can print LogicalPlan
> > (similar to LogicalPlanSerializer), then its just the matter for string
> > comparison which one has to do in test for verification.
> >
> > -Chinmay.
> >
> >
> >
> > On Thu, Sep 22, 2016 at 11:59 AM, Tushar Gosavi 
> > wrote:
> >
> >> There is a LogicalPlanSerializer in apex engine, which will generate
> >> json equivalent of DAG, which can be printed easily.
> >> But for verification check if right components are added in DAG by
> >> examining dag structure than printing it.
> >>
> >> - Tushar.
> >>
> >>
> >> On Thu, Sep 22, 2016 at 11:39 AM, Chinmay Kolhatkar
> >>  wrote:
> >> > Hi All,
> >> >
> >> > For testing of calcite generated DAG, I want to verify whether the
> >> correct
> >> > DAG is created with right operators and streams and whether properties
> >> are
> >> > set properly.
> >> >
> >> > Is there any PrintWriter for the LogicalPlan object printing
> components
> >> of
> >> > DAG?
> >> >
> >> > If not, I am planning to write one for Calcite testing.
> >> > I was wondering if that would be useful addition to apex-core for
> testing
> >> > purpose.
> >> >
> >> > Please share your opinion.
> >> >
> >> > Thanks,
> >> > Chinmay.
> >>
>


[jira] [Assigned] (APEXCORE-310) apex cli - support to kill the app by appname

2016-09-22 Thread Deepak Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXCORE-310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Narkhede reassigned APEXCORE-310:


Assignee: Deepak Narkhede

> apex cli - support to kill the app by appname
> -
>
> Key: APEXCORE-310
> URL: https://issues.apache.org/jira/browse/APEXCORE-310
> Project: Apache Apex Core
>  Issue Type: Improvement
>Reporter: Sandesh
>Assignee: Deepak Narkhede
>Priority: Minor
>
> dtcli should support the ability to kill the app by appname. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXCORE-528) Output Ports Not Optional by Default During Validation

2016-09-22 Thread Deepak Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15512979#comment-15512979
 ] 

Deepak Narkhede commented on APEXCORE-528:
--

Hi Alex,
Are you currently working on it. If not, Please let me know I would like to 
take it up.

Thanks,
Deepak

> Output Ports Not Optional by Default During Validation
> --
>
> Key: APEXCORE-528
> URL: https://issues.apache.org/jira/browse/APEXCORE-528
> Project: Apache Apex Core
>  Issue Type: Bug
>Affects Versions: 3.4.0
>Reporter: Alex McCullough
>Assignee: Alex McCullough
>Priority: Minor
>
> The 'optional' OutputPortFieldAnnotation states that the default value is 
> true. When you build a DAG with multiple output ports the validator throws an 
> error telling you at least one must be connected. To fix, you must explicitly 
> add the annotation to all output ports and set the value to True, which is 
> supposed to already be the default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: excluding hadoop dependencies from application package

2016-09-22 Thread Munagala Ramanath
Definitely worth adding.

Ram

On Wed, Sep 21, 2016 at 1:20 PM, Pramod Immaneni 
wrote:

> Candidate to be added here?
>
> https://apex.apache.org/docs/apex/development_best_practices/
>
> On Wed, Sep 21, 2016 at 12:24 PM, Munagala Ramanath 
> wrote:
>
> > Some info here:
> > http://docs.datatorrent.com/troubleshooting/#hadoop-
> dependencies-conflicts
> >
> > Ram
> >
> >
> > On Wed, Sep 21, 2016 at 12:00 PM, Vlad Rozov 
> > wrote:
> >
> > > Is subject already documented?
> > >
> > > Thank you,
> > >
> > > Vlad
> > >
> >
>


Trying to access live data from couchbase

2016-09-22 Thread Hitesh Goyal
Hi team,

Trying to access live data from couchbase using input Operators.
It is running fine on my local. The tuples are emitting as Pojo through the 
output port of inputOperator.
But when I upload the .apa file on data torrent, it gets launched but soon it 
goes into Failed State just after been accepted.
Can you please help me to resolve the issue ? I am sending logs as an attached 
file.

Regards,
Hitesh Goyal
Simpli5d Technologies
Cont No.: 9996588220

2016-09-22 12:40:30,308 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
Updating application attempt appattempt_1474540741175_0001_01 with final 
state: FAILED, and exit status: 1
2016-09-22 12:40:30,308 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
appattempt_1474540741175_0001_01 State change from LAUNCHED to FINAL_SAVING
2016-09-22 12:40:30,308 INFO 
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: 
Unregistering app attempt : appattempt_1474540741175_0001_01
2016-09-22 12:40:30,309 INFO 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: 
Application finished, removing password for appattempt_1474540741175_0001_01
2016-09-22 12:40:30,309 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
appattempt_1474540741175_0001_01 State change from FINAL_SAVING to FAILED
2016-09-22 12:40:30,309 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The number of 
failed attempts is 1. The max attempts is 2
2016-09-22 12:40:30,309 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
 Application Attempt appattempt_1474540741175_0001_01 is done. 
finalState=FAILED
2016-09-22 12:40:30,309 INFO 
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: 
Registering app attempt : appattempt_1474540741175_0001_02
2016-09-22 12:40:30,309 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
appattempt_1474540741175_0001_02 State change from NEW to SUBMITTED
2016-09-22 12:40:30,309 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: 
Application application_1474540741175_0001 requests cleared
2016-09-22 12:40:30,310 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
Application removed - appId: application_1474540741175_0001 user: dtadmin 
queue: default #user-pending-applications: 0 #user-active-applications: 0 
#queue-pending-applications: 0 #queue-active-applications: 0
2016-09-22 12:40:30,310 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
Application application_1474540741175_0001 from user: dtadmin activated in 
queue: default
2016-09-22 12:40:30,310 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
Application added - appId: application_1474540741175_0001 user: 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@77d0922b,
 leaf-queue: default #user-pending-applications: 0 #user-active-applications: 1 
#queue-pending-applications: 0 #queue-active-applications: 1
2016-09-22 12:40:30,310 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
 Added Application Attempt appattempt_1474540741175_0001_02 to scheduler 
from user dtadmin in queue default
2016-09-22 12:40:30,310 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
appattempt_1474540741175_0001_02 State change from SUBMITTED to SCHEDULED
2016-09-22 12:40:31,308 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
 Null container completed...
2016-09-22 12:40:31,308 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
container_1474540741175_0001_02_01 Container Transitioned from NEW to 
ALLOCATED
2016-09-22 12:40:31,308 INFO 
org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=dtadmin  
OPERATION=AM Allocated ContainerTARGET=SchedulerApp RESULT=SUCCESS  
APPID=application_1474540741175_0001
CONTAINERID=container_1474540741175_0001_02_01
2016-09-22 12:40:31,308 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned 
container container_1474540741175_0001_02_01 of capacity  on host localhost.localdomain:35189, which has 1 containers, 
 used and  available after 
allocation
2016-09-22 12:40:31,308 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
assignedContainer application attempt=appattempt_1474540741175_0001_02 
container=Container: [ContainerId: container_1474540741175_0001_02_01, 
NodeId: localhost.localdomain:35189, NodeHttpAddress: 
localhost.localdomain:8042, Resource: , Priority: 0, 
Token: null, ] queue=default: capacity=1.0, absoluteCapacity=1.0, 
usedResources=, usedCapacity=0.0, absoluteUsedCapacity=0.0,

[jira] [Commented] (APEXCORE-528) Output Ports Not Optional by Default During Validation

2016-09-22 Thread Alex McCullough (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513421#comment-15513421
 ] 

Alex McCullough commented on APEXCORE-528:
--

Hey Deepak, 

I am working on this. 

I did want to clarify the exact rules that should be in place though for the 
output ports. 

I see in the comments it states:

 // multiple ports w/o annotation, one of them must be connected

Is this truly the rule? My proposal would be for output ports, there is never a 
requirement to connect them unless the annotation explicitly marks it as 
optional = false.

Thanks,
Alex

> Output Ports Not Optional by Default During Validation
> --
>
> Key: APEXCORE-528
> URL: https://issues.apache.org/jira/browse/APEXCORE-528
> Project: Apache Apex Core
>  Issue Type: Bug
>Affects Versions: 3.4.0
>Reporter: Alex McCullough
>Assignee: Alex McCullough
>Priority: Minor
>
> The 'optional' OutputPortFieldAnnotation states that the default value is 
> true. When you build a DAG with multiple output ports the validator throws an 
> error telling you at least one must be connected. To fix, you must explicitly 
> add the annotation to all output ports and set the value to True, which is 
> supposed to already be the default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXCORE-528) Output Ports Not Optional by Default During Validation

2016-09-22 Thread Thomas Weise (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513460#comment-15513460
 ] 

Thomas Weise commented on APEXCORE-528:
---

IMO output ports that don't have an annotation should be considered optional, 
that's also the default in the annotation.

> Output Ports Not Optional by Default During Validation
> --
>
> Key: APEXCORE-528
> URL: https://issues.apache.org/jira/browse/APEXCORE-528
> Project: Apache Apex Core
>  Issue Type: Bug
>Affects Versions: 3.4.0
>Reporter: Alex McCullough
>Assignee: Alex McCullough
>Priority: Minor
>
> The 'optional' OutputPortFieldAnnotation states that the default value is 
> true. When you build a DAG with multiple output ports the validator throws an 
> error telling you at least one must be connected. To fix, you must explicitly 
> add the annotation to all output ports and set the value to True, which is 
> supposed to already be the default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXCORE-528) Output Ports Not Optional by Default During Validation

2016-09-22 Thread Alex McCullough (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513507#comment-15513507
 ] 

Alex McCullough commented on APEXCORE-528:
--

Because when no annotations are added the OutputPortMeta.portAnnotation is 
null, it can't find the defaults to populate. 

Is there any issue with using reflection to populate defaults when no explicit 
annotation value is provided? Then you can just have standard logic and not 
worry about null objects and such.

> Output Ports Not Optional by Default During Validation
> --
>
> Key: APEXCORE-528
> URL: https://issues.apache.org/jira/browse/APEXCORE-528
> Project: Apache Apex Core
>  Issue Type: Bug
>Affects Versions: 3.4.0
>Reporter: Alex McCullough
>Assignee: Alex McCullough
>Priority: Minor
>
> The 'optional' OutputPortFieldAnnotation states that the default value is 
> true. When you build a DAG with multiple output ports the validator throws an 
> error telling you at least one must be connected. To fix, you must explicitly 
> add the annotation to all output ports and set the value to True, which is 
> supposed to already be the default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXCORE-528) Output Ports Not Optional by Default During Validation

2016-09-22 Thread Thomas Weise (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513515#comment-15513515
 ] 

Thomas Weise commented on APEXCORE-528:
---

Can you point me to the exact place in the code you are referring to?

> Output Ports Not Optional by Default During Validation
> --
>
> Key: APEXCORE-528
> URL: https://issues.apache.org/jira/browse/APEXCORE-528
> Project: Apache Apex Core
>  Issue Type: Bug
>Affects Versions: 3.4.0
>Reporter: Alex McCullough
>Assignee: Alex McCullough
>Priority: Minor
>
> The 'optional' OutputPortFieldAnnotation states that the default value is 
> true. When you build a DAG with multiple output ports the validator throws an 
> error telling you at least one must be connected. To fix, you must explicitly 
> add the annotation to all output ports and set the value to True, which is 
> supposed to already be the default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXCORE-528) Output Ports Not Optional by Default During Validation

2016-09-22 Thread Alex McCullough (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513525#comment-15513525
 ] 

Alex McCullough commented on APEXCORE-528:
--

https://github.com/apache/apex-core/blob/master/engine/src/main/java/com/datatorrent/stram/plan/logical/LogicalPlan.java#L1748-L1773

> Output Ports Not Optional by Default During Validation
> --
>
> Key: APEXCORE-528
> URL: https://issues.apache.org/jira/browse/APEXCORE-528
> Project: Apache Apex Core
>  Issue Type: Bug
>Affects Versions: 3.4.0
>Reporter: Alex McCullough
>Assignee: Alex McCullough
>Priority: Minor
>
> The 'optional' OutputPortFieldAnnotation states that the default value is 
> true. When you build a DAG with multiple output ports the validator throws an 
> error telling you at least one must be connected. To fix, you must explicitly 
> add the annotation to all output ports and set the value to True, which is 
> supposed to already be the default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (APEXCORE-538) Log raw data when RPC message fails to de-serialize

2016-09-22 Thread Vlad Rozov (JIRA)
Vlad Rozov created APEXCORE-538:
---

 Summary: Log raw data when RPC message fails to de-serialize 
 Key: APEXCORE-538
 URL: https://issues.apache.org/jira/browse/APEXCORE-538
 Project: Apache Apex Core
  Issue Type: Improvement
Reporter: Vlad Rozov
Assignee: Vlad Rozov
 Fix For: 3.2.2, 3.3.1, 3.4.1, 3.5.0






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (APEXMALHAR-2261) Python binding for high level API

2016-09-22 Thread Thomas Weise (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise updated APEXMALHAR-2261:
-
Labels: roadmap  (was: )

> Python binding for high level API
> -
>
> Key: APEXMALHAR-2261
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2261
> Project: Apache Apex Malhar
>  Issue Type: New Feature
>Reporter: Thomas Weise
>  Labels: roadmap
>
> A high level API similar to the Apex Java stream API that lets users specify 
> an application in Python.
> https://lists.apache.org/thread.html/9837b1dee8f909ed400c6030ce5c6a94a12f43183718019dd0bfd228@%3Cdev.apex.apache.org%3E



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Python support

2016-09-22 Thread Thomas Weise
I created the following JIRAs:

https://issues.apache.org/jira/browse/APEXMALHAR-2260
https://issues.apache.org/jira/browse/APEXMALHAR-2261


On Wed, Sep 21, 2016 at 11:10 PM, Chinmay Kolhatkar  wrote:

> I would like to help in contributing to this feature.
>
> On Wed, Sep 21, 2016 at 12:26 AM, Sasha Parfenov 
> wrote:
>
> > +1 on both executing Python code in an operator and high level API for
> > constructing Pipelines in Python.
> >
> > There is a large user base of engineers and data scientists which use
> > Python on regular basis for crunching through big data.  Providing them
> > with a powerful new platform for big data processing, wrapped in a
> familiar
> > language, will open Apex to a much broader user base and help grow the
> > project.
> >
> > Given the potentially new user base of Python developers, it may make
> sense
> > to prioritize the high level API for pipeline construction.  This will
> > allow users to build simple applications with existing library operators,
> > and we can get feedback on what areas they would like to see improved
> next
> > - custom Python operator support or more built-in library operators.
> >
> > Thanks,
> > Sasha
> >
> > On Thu, Sep 15, 2016 at 2:06 PM, Thomas Weise  wrote:
> >
> > > Hi,
> > >
> > > Python (not Jython) seems to be a popular language and frequently used
> > for
> > > data analysis, especially where flexibility matters. It has a
> > comprehensive
> > > library and it is generally considered low barrier to entry. I have
> also
> > > seen Python used in critical back-end components, although that's
> > probably
> > > not very common?
> > >
> > > I think Python support could potentially expand the user base for Apex.
> > > There are 2 main areas that can be considered:
> > >
> > > 1) Support to execute Python code through an operator
> > > 2) A client API that lets users construct pipelines in Python
> > >
> > > The former can exist without the latter. And it would enable users to
> > > leverage existing code that otherwise would have to be rewritten in a
> JVM
> > > language. The engine could ship scripts/packages so they are
> > automatically
> > > distributed on the cluster.
> > >
> > > A useful client API probably requires back-end support for lambda
> > functions
> > > and more complex UDFs.
> > >
> > > Would be great to get some feedback, especially from those that have
> > > experience with Python, on how an integration could potentially open up
> > new
> > > use cases for Apex.
> > >
> > > Thanks,
> > > Thomas
> > >
> >
>


[jira] [Commented] (APEXCORE-528) Output Ports Not Optional by Default During Validation

2016-09-22 Thread Thomas Weise (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513735#comment-15513735
 ] 

Thomas Weise commented on APEXCORE-528:
---

Isn't it just a matter of removing the allPortsOptional check?

> Output Ports Not Optional by Default During Validation
> --
>
> Key: APEXCORE-528
> URL: https://issues.apache.org/jira/browse/APEXCORE-528
> Project: Apache Apex Core
>  Issue Type: Bug
>Affects Versions: 3.4.0
>Reporter: Alex McCullough
>Assignee: Alex McCullough
>Priority: Minor
>
> The 'optional' OutputPortFieldAnnotation states that the default value is 
> true. When you build a DAG with multiple output ports the validator throws an 
> error telling you at least one must be connected. To fix, you must explicitly 
> add the annotation to all output ports and set the value to True, which is 
> supposed to already be the default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (APEXMALHAR-2261) Python binding for high level API

2016-09-22 Thread Thomas Weise (JIRA)
Thomas Weise created APEXMALHAR-2261:


 Summary: Python binding for high level API
 Key: APEXMALHAR-2261
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2261
 Project: Apache Apex Malhar
  Issue Type: New Feature
Reporter: Thomas Weise


A high level API similar to the Apex Java stream API that lets users specify an 
application in Python.

https://lists.apache.org/thread.html/9837b1dee8f909ed400c6030ce5c6a94a12f43183718019dd0bfd228@%3Cdev.apex.apache.org%3E




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (APEXMALHAR-2130) implement scalable windowed storage

2016-09-22 Thread Thomas Weise (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise updated APEXMALHAR-2130:
-
Labels: roadmap  (was: )

> implement scalable windowed storage
> ---
>
> Key: APEXMALHAR-2130
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2130
> Project: Apache Apex Malhar
>  Issue Type: Task
>Reporter: bright chen
>Assignee: David Yan
>  Labels: roadmap
>
> This feature is used for supporting windowing.
> The storage needs to have the following features:
> 1. Spillable key value storage (integrate with APEXMALHAR-2026)
> 2. Upon checkpoint, it saves a snapshot for the entire data set with the 
> checkpointing window id.  This should be done incrementally (ManagedState) to 
> avoid wasting space with unchanged data
> 3. When recovering, it takes the recovery window id and restores to that 
> snapshot
> 4. When a window is committed, all windows with a lower ID should be purged 
> from the store.
> 5. It should implement the WindowedStorage and WindowedKeyedStorage 
> interfaces, and because of 2 and 3, we may want to add methods to the 
> WindowedStorage interface so that the implementation of WindowedOperator can 
> notify the storage of checkpointing, recovering and committing of a window.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (APEXMALHAR-2130) Scalable windowed storage

2016-09-22 Thread Thomas Weise (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise updated APEXMALHAR-2130:
-
Summary: Scalable windowed storage  (was: implement scalable windowed 
storage)

> Scalable windowed storage
> -
>
> Key: APEXMALHAR-2130
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2130
> Project: Apache Apex Malhar
>  Issue Type: Task
>Reporter: bright chen
>Assignee: David Yan
>  Labels: roadmap
>
> This feature is used for supporting windowing.
> The storage needs to have the following features:
> 1. Spillable key value storage (integrate with APEXMALHAR-2026)
> 2. Upon checkpoint, it saves a snapshot for the entire data set with the 
> checkpointing window id.  This should be done incrementally (ManagedState) to 
> avoid wasting space with unchanged data
> 3. When recovering, it takes the recovery window id and restores to that 
> snapshot
> 4. When a window is committed, all windows with a lower ID should be purged 
> from the store.
> 5. It should implement the WindowedStorage and WindowedKeyedStorage 
> interfaces, and because of 2 and 3, we may want to add methods to the 
> WindowedStorage interface so that the implementation of WindowedOperator can 
> notify the storage of checkpointing, recovering and committing of a window.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (APEXMALHAR-2260) Python execution for operator logic

2016-09-22 Thread Thomas Weise (JIRA)
Thomas Weise created APEXMALHAR-2260:


 Summary: Python execution for operator logic 
 Key: APEXMALHAR-2260
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2260
 Project: Apache Apex Malhar
  Issue Type: New Feature
Reporter: Thomas Weise


Support execution of Python code in an operator. 

https://lists.apache.org/thread.html/9837b1dee8f909ed400c6030ce5c6a94a12f43183718019dd0bfd228@%3Cdev.apex.apache.org%3E




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXMALHAR-2244) Optimize WindowedStorage and Spillable data structures for time series

2016-09-22 Thread Siyuan Hua (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513770#comment-15513770
 ] 

Siyuan Hua commented on APEXMALHAR-2244:


Each spillable DS implementation use a SpillableStateStore to store things and 
we can make ManagedTimeUnifiedStateImpl implement the store as well and it can 
take some time extract function to get/calculate time and time buckets from 
each V/KV data.  And the Store can be setup by the WindowedOperator, correct? 

> Optimize WindowedStorage and Spillable data structures for time series
> --
>
> Key: APEXMALHAR-2244
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2244
> Project: Apache Apex Malhar
>  Issue Type: Sub-task
>Reporter: David Yan
>Assignee: Siyuan Hua
>
> The spillable data structures currently does not make any assumption about 
> the key that is used in Managed State, and as a result, it uses 
> ManagedStateImpl to interface with Managed State and uses time buckets that 
> are based on the apex window id. But for WindowedStorage used by 
> WindowedOperator, the key to the storage is a window, which is event time 
> based. Using the default ManagedStateImpl would be very inefficient for event 
> time based keys, since it would write data that would belong to the same 
> window to different time buckets.
> On a high level, the below summarizes roughly what needs to be done:
> 1. a way to tell the spillable data structures to use the 
> ManagedTimeUnifiedStateImpl
> 2. a way to tell the spillable data structures how to extract the timestamp 
> from the key. Note that in the case of WindowedOperator, the timestamp should 
> be the end timestamp of the window (beginTimeMillis + durationMillis), not 
> the begin timestamp.
> 3. a way to tell the spillable data structures how to assign the time bucket 
> given that timestamp
> 4. with point 3, the spillable implementations of WindowedStorage will need 
> to take a config parameter that says how much time (in millis) is each time 
> bucket
> 5. only purge a time bucket when all keys that belong to that time bucket are 
> removed and the apex window id of the first window in which the keys are all 
> removed has been committed



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] apex-core pull request #396: APEXCORE-538 - Log raw data when RPC message fa...

2016-09-22 Thread vrozov
GitHub user vrozov opened a pull request:

https://github.com/apache/apex-core/pull/396

APEXCORE-538 - Log raw data when RPC message fails to de-serialize

@tweise Please review and cherry-pick. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vrozov/apex-core APEXCORE-538

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-core/pull/396.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #396


commit e0ee2d105ad4091eee2e14cf4c519bdb732734d8
Author: Vlad Rozov 
Date:   2016-09-22T16:31:32Z

APEXCORE-538 - Log raw data when RPC message fails to de-serialize




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (APEXCORE-538) Log raw data when RPC message fails to de-serialize

2016-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513785#comment-15513785
 ] 

ASF GitHub Bot commented on APEXCORE-538:
-

GitHub user vrozov opened a pull request:

https://github.com/apache/apex-core/pull/396

APEXCORE-538 - Log raw data when RPC message fails to de-serialize

@tweise Please review and cherry-pick. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vrozov/apex-core APEXCORE-538

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-core/pull/396.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #396


commit e0ee2d105ad4091eee2e14cf4c519bdb732734d8
Author: Vlad Rozov 
Date:   2016-09-22T16:31:32Z

APEXCORE-538 - Log raw data when RPC message fails to de-serialize




> Log raw data when RPC message fails to de-serialize 
> 
>
> Key: APEXCORE-538
> URL: https://issues.apache.org/jira/browse/APEXCORE-538
> Project: Apache Apex Core
>  Issue Type: Improvement
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
> Fix For: 3.3.1, 3.2.2, 3.5.0, 3.4.1
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] apex-malhar pull request #408: APEXMALHAR-2248 Added SpillableSet and Spilla...

2016-09-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/apex-malhar/pull/408


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (APEXMALHAR-2248) Create SpillableSet and SpillableSetMultimap interfaces and implementation

2016-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513787#comment-15513787
 ] 

ASF GitHub Bot commented on APEXMALHAR-2248:


Github user asfgit closed the pull request at:

https://github.com/apache/apex-malhar/pull/408


> Create SpillableSet and SpillableSetMultimap interfaces and implementation
> --
>
> Key: APEXMALHAR-2248
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2248
> Project: Apache Apex Malhar
>  Issue Type: Sub-task
>Reporter: David Yan
>Assignee: David Yan
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (APEXMALHAR-2248) Create SpillableSet and SpillableSetMultimap interfaces and implementation

2016-09-22 Thread Siyuan Hua (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyuan Hua updated APEXMALHAR-2248:
---
Fix Version/s: 3.6.0

> Create SpillableSet and SpillableSetMultimap interfaces and implementation
> --
>
> Key: APEXMALHAR-2248
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2248
> Project: Apache Apex Malhar
>  Issue Type: Sub-task
>Reporter: David Yan
>Assignee: David Yan
> Fix For: 3.6.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (APEXMALHAR-2248) Create SpillableSet and SpillableSetMultimap interfaces and implementation

2016-09-22 Thread Siyuan Hua (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyuan Hua resolved APEXMALHAR-2248.

Resolution: Fixed

> Create SpillableSet and SpillableSetMultimap interfaces and implementation
> --
>
> Key: APEXMALHAR-2248
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2248
> Project: Apache Apex Malhar
>  Issue Type: Sub-task
>Reporter: David Yan
>Assignee: David Yan
> Fix For: 3.6.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] apex-malhar pull request #404: APEXMALHAR-2190 #resolve #comment Use reusabl...

2016-09-22 Thread brightchen
Github user brightchen closed the pull request at:

https://github.com/apache/apex-malhar/pull/404


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] apex-malhar pull request #404: APEXMALHAR-2190 #resolve #comment Use reusabl...

2016-09-22 Thread brightchen
GitHub user brightchen reopened a pull request:

https://github.com/apache/apex-malhar/pull/404

APEXMALHAR-2190 #resolve #comment Use reusable buffer to serial spill…

…able data structure

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/brightchen/apex-malhar APEXMALHAR-2190-PR

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/404.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #404


commit b17ca37cf7f0ae8037ce69206e9ae914f8168d33
Author: brightchen 
Date:   2016-08-16T00:46:27Z

APEXMALHAR-2190 #resolve #comment Use reusable buffer to serial spillable 
data structure




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (APEXMALHAR-2190) Use reusable buffer to serial spillable data structure

2016-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513929#comment-15513929
 ] 

ASF GitHub Bot commented on APEXMALHAR-2190:


Github user brightchen closed the pull request at:

https://github.com/apache/apex-malhar/pull/404


> Use reusable buffer to serial spillable data structure
> --
>
> Key: APEXMALHAR-2190
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2190
> Project: Apache Apex Malhar
>  Issue Type: Task
>Reporter: bright chen
>Assignee: bright chen
>   Original Estimate: 240h
>  Remaining Estimate: 240h
>
> Spillable Data Structure created lots of temporary memory to serial data lot 
> of of memory copy( see SliceUtils.concatenate(byte[], byte[]). Which used up 
> memory very quickly. See APEXMALHAR-2182.
> Use a shared memory to avoid allocate temporary memory and memory copy
> some basic ideas
> - SerToLVBuffer interface provides a method serTo(T object, LengthValueBuffer 
> buffer): instead of create a memory and then return the serialized data, this 
> method let the caller pass in the buffer. So different objects or object with 
> embed objects can share the same LengthValueBuffer
> - LengthValueBuffer: It is a buffer which manage the memory as length and 
> value(which is the generic format of serialized data). which provide length 
> placeholder mechanism to avoid temporary memory and data copy when the length 
> can be know after data serialized
> - memory management classes: includes interface ByteStream and it's 
> implementations: Block, FixedBlock, BlocksStream. Which provides a mechanism 
> to dynamic allocate and manage memory. Which basically provides following 
> function. I tried other some other stream mechamism such as 
> ByteArrayInputStream, but it can meet 3rd criteria, and don't have good 
> performance(50% loss) 
>   - dynamic allocate memory
>   - reset memory for reuse
>   - BlocksStream make sure the output slices will not be changed when need 
> extra memory; Block can change the reference of output slices buffer is data 
> was moved due to reallocate of memory(BlocksStream is better solution).
>   - WindowableBlocksStream extends from BlocksStream and provides function to 
> reset memory window by window instead of reset all memory. It provides 
> certain amount of cache( as bytes ) in memory



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXMALHAR-2190) Use reusable buffer to serial spillable data structure

2016-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513930#comment-15513930
 ] 

ASF GitHub Bot commented on APEXMALHAR-2190:


GitHub user brightchen reopened a pull request:

https://github.com/apache/apex-malhar/pull/404

APEXMALHAR-2190 #resolve #comment Use reusable buffer to serial spill…

…able data structure

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/brightchen/apex-malhar APEXMALHAR-2190-PR

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/404.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #404


commit b17ca37cf7f0ae8037ce69206e9ae914f8168d33
Author: brightchen 
Date:   2016-08-16T00:46:27Z

APEXMALHAR-2190 #resolve #comment Use reusable buffer to serial spillable 
data structure




> Use reusable buffer to serial spillable data structure
> --
>
> Key: APEXMALHAR-2190
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2190
> Project: Apache Apex Malhar
>  Issue Type: Task
>Reporter: bright chen
>Assignee: bright chen
>   Original Estimate: 240h
>  Remaining Estimate: 240h
>
> Spillable Data Structure created lots of temporary memory to serial data lot 
> of of memory copy( see SliceUtils.concatenate(byte[], byte[]). Which used up 
> memory very quickly. See APEXMALHAR-2182.
> Use a shared memory to avoid allocate temporary memory and memory copy
> some basic ideas
> - SerToLVBuffer interface provides a method serTo(T object, LengthValueBuffer 
> buffer): instead of create a memory and then return the serialized data, this 
> method let the caller pass in the buffer. So different objects or object with 
> embed objects can share the same LengthValueBuffer
> - LengthValueBuffer: It is a buffer which manage the memory as length and 
> value(which is the generic format of serialized data). which provide length 
> placeholder mechanism to avoid temporary memory and data copy when the length 
> can be know after data serialized
> - memory management classes: includes interface ByteStream and it's 
> implementations: Block, FixedBlock, BlocksStream. Which provides a mechanism 
> to dynamic allocate and manage memory. Which basically provides following 
> function. I tried other some other stream mechamism such as 
> ByteArrayInputStream, but it can meet 3rd criteria, and don't have good 
> performance(50% loss) 
>   - dynamic allocate memory
>   - reset memory for reuse
>   - BlocksStream make sure the output slices will not be changed when need 
> extra memory; Block can change the reference of output slices buffer is data 
> was moved due to reallocate of memory(BlocksStream is better solution).
>   - WindowableBlocksStream extends from BlocksStream and provides function to 
> reset memory window by window instead of reset all memory. It provides 
> certain amount of cache( as bytes ) in memory



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (APEXCORE-539) DefaultAttributeMap is not thread safe

2016-09-22 Thread Vlad Rozov (JIRA)
Vlad Rozov created APEXCORE-539:
---

 Summary: DefaultAttributeMap is not thread safe
 Key: APEXCORE-539
 URL: https://issues.apache.org/jira/browse/APEXCORE-539
 Project: Apache Apex Core
  Issue Type: Bug
Reporter: Vlad Rozov


DefaultAttributeMap put() and get() methods may be called from different 
threads and current implementation that uses HashMap is not thread safe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXCORE-539) DefaultAttributeMap is not thread safe

2016-09-22 Thread Thomas Weise (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513980#comment-15513980
 ] 

Thomas Weise commented on APEXCORE-539:
---

It wasn't the original intention to modify the map once it is shared between 
threads. Where is it happening?

> DefaultAttributeMap is not thread safe
> --
>
> Key: APEXCORE-539
> URL: https://issues.apache.org/jira/browse/APEXCORE-539
> Project: Apache Apex Core
>  Issue Type: Bug
>Reporter: Vlad Rozov
>
> DefaultAttributeMap put() and get() methods may be called from different 
> threads and current implementation that uses HashMap is not thread safe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXCORE-539) DefaultAttributeMap is not thread safe

2016-09-22 Thread Vlad Rozov (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514021#comment-15514021
 ] 

Vlad Rozov commented on APEXCORE-539:
-

I see the following exception after I changed DefaultAttributeMap to store and 
compare thread with the current thread:
{noformat}
2016-09-22 10:36:46,849 [container-0] ERROR stram.StramLocalCluster run - 
Container container-0 failed
java.util.ConcurrentModificationException: current thread 
Thread[container-0,5,main] existing thread Thread[main,5,main]
at 
com.datatorrent.api.Attribute$AttributeMap$DefaultAttributeMap.put(Attribute.java:209)
at 
com.datatorrent.stram.engine.StreamingContainer.setup(StreamingContainer.java:160)
at 
com.datatorrent.stram.StramLocalCluster$LocalStreamingContainer.run(StramLocalCluster.java:178)
at 
com.datatorrent.stram.StramLocalCluster$LocalStramChildLauncher.run(StramLocalCluster.java:269)
at java.lang.Thread.run(Thread.java:745)
{noformat}

Why not to use ConcurrentHashMap instead of HashMap?

> DefaultAttributeMap is not thread safe
> --
>
> Key: APEXCORE-539
> URL: https://issues.apache.org/jira/browse/APEXCORE-539
> Project: Apache Apex Core
>  Issue Type: Bug
>Reporter: Vlad Rozov
>
> DefaultAttributeMap put() and get() methods may be called from different 
> threads and current implementation that uses HashMap is not thread safe.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXCORE-532) New dynamically added operator does not start with correct windowId.

2016-09-22 Thread Pramod Immaneni (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514071#comment-15514071
 ] 

Pramod Immaneni commented on APEXCORE-532:
--

[~tushargosavi] checking why this is broken when it used to work before.

> New dynamically added operator does not start with correct windowId.
> 
>
> Key: APEXCORE-532
> URL: https://issues.apache.org/jira/browse/APEXCORE-532
> Project: Apache Apex Core
>  Issue Type: Bug
>Reporter: Tushar Gosavi
>Priority: Critical
>
> During dynamic DAG change, If new operator is added and connected to existing 
> operator, it does not starts with correct windowId. The baseSeconds is set to 
> 0 causing windowId management problems at master effectively halting purge 
> from buffer server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (APEXMALHAR-2262) lock on is too AbstractManagedStateImpl.getValueFromBucketSync wide

2016-09-22 Thread bright chen (JIRA)
bright chen created APEXMALHAR-2262:
---

 Summary: lock on is too 
AbstractManagedStateImpl.getValueFromBucketSync wide
 Key: APEXMALHAR-2262
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2262
 Project: Apache Apex Malhar
  Issue Type: Improvement
Reporter: bright chen
Assignee: bright chen


The Managed State used a lot of lock, which could impact a lot on performance. 
AbstractManagedStateImpl.getValueFromBucketSync(long, long, Slice) lock the 
buck to get value, But if the value still in memory, the lock is not necessary 
as flash is ConcurrentMap.

probably AbstractManagedStateImpl should only lock when add/remove bucket. And 
bucket handle read/write lock inside bucket
  - 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXCORE-532) New dynamically added operator does not start with correct windowId.

2016-09-22 Thread Tushar Gosavi (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514117#comment-15514117
 ] 

Tushar Gosavi commented on APEXCORE-532:




If newly added operator is attached to some operator which was already exists 
in the DAG,
- It starts from activation checkpoint
- bufferserver subscriber use activation checkpoint id to set the initial 
baseSeconds (setup of BufferServerSubscriber).
- when new data is received, we get partital windowId form tuple, or it with 
baseSeconds and
  construct full windowId.

baseSeconds is expected to set to correct value if operator starts from known 
checkpoint,
during initial dag deployment input operators sends resetWindow tuple, which 
sets baseSeconds
for all downstream operators. In this case operator was added in the end and 
was started
from activation checkpoint. but no resetWindow tuple was received to set its 
baseSeconds to
correct value.

In this case the starting checkpoint is different than windowId, may be we 
should use two fields in
deploy info
- one for checkpoint from where to load the state
- starting baseSeconds which is taken from existing upstream to correctly set 
the windowIds in absense of resetWindow tuple.


> New dynamically added operator does not start with correct windowId.
> 
>
> Key: APEXCORE-532
> URL: https://issues.apache.org/jira/browse/APEXCORE-532
> Project: Apache Apex Core
>  Issue Type: Bug
>Reporter: Tushar Gosavi
>Priority: Critical
>
> During dynamic DAG change, If new operator is added and connected to existing 
> operator, it does not starts with correct windowId. The baseSeconds is set to 
> 0 causing windowId management problems at master effectively halting purge 
> from buffer server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (APEXMALHAR-2262) lock on AbstractManagedStateImpl.getValueFromBucketSync is too wide

2016-09-22 Thread bright chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bright chen updated APEXMALHAR-2262:

Summary: lock on AbstractManagedStateImpl.getValueFromBucketSync is too 
wide  (was: lock on is too AbstractManagedStateImpl.getValueFromBucketSync wide)

> lock on AbstractManagedStateImpl.getValueFromBucketSync is too wide
> ---
>
> Key: APEXMALHAR-2262
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2262
> Project: Apache Apex Malhar
>  Issue Type: Improvement
>Reporter: bright chen
>Assignee: bright chen
>
> The Managed State used a lot of lock, which could impact a lot on 
> performance. 
> AbstractManagedStateImpl.getValueFromBucketSync(long, long, Slice) lock the 
> buck to get value, But if the value still in memory, the lock is not 
> necessary as flash is ConcurrentMap.
> probably AbstractManagedStateImpl should only lock when add/remove bucket. 
> And bucket handle read/write lock inside bucket
>   - 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (APEXMALHAR-2263) Offsets in AbstractFileInputOperator should be long rather than int

2016-09-22 Thread Munagala V. Ramanath (JIRA)
Munagala V. Ramanath created APEXMALHAR-2263:


 Summary: Offsets in AbstractFileInputOperator should be long 
rather than int
 Key: APEXMALHAR-2263
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2263
 Project: Apache Apex Malhar
  Issue Type: Bug
  Components: adapters other
Reporter: Munagala V. Ramanath


Offsets in AbstractFileInputOperator use the int type which means files with 
more that (2**31 -1) records will cause overflows and mysterious failures.

Should be changed to use long.
 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Trying to access live data from couchbase

2016-09-22 Thread Vlad Rozov

Check Application master logs and stderr output.

Thank you,

Vlad

On 9/22/16 04:09, Hitesh Goyal wrote:


Hi team,

Trying to access live data from couchbase using input Operators.

It is running fine on my local. The tuples are emitting as Pojo 
through the output port of inputOperator.


But when I upload the .apa file on data torrent, it gets launched but 
soon it goes into Failed State just after been accepted.


Can you please help me to resolve the issue ? I am sending logs as an 
attached file.


Regards,

*Hitesh Goyal*

Simpli5d Technologies

Cont No.: 9996588220





[GitHub] apex-malhar pull request #399: REVIEW ONLY: exposing seeking and iterating o...

2016-09-22 Thread davidyan74
Github user davidyan74 closed the pull request at:

https://github.com/apache/apex-malhar/pull/399


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] apex-malhar pull request #377: *Review Only* APEXMALHAR-2192 Added some spil...

2016-09-22 Thread davidyan74
Github user davidyan74 closed the pull request at:

https://github.com/apache/apex-malhar/pull/377


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (APEXMALHAR-2192) Implement SpillableByteArrayListMultimapImpl.removeAll(Object key)

2016-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514345#comment-15514345
 ] 

ASF GitHub Bot commented on APEXMALHAR-2192:


Github user davidyan74 closed the pull request at:

https://github.com/apache/apex-malhar/pull/377


> Implement SpillableByteArrayListMultimapImpl.removeAll(Object key)
> --
>
> Key: APEXMALHAR-2192
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2192
> Project: Apache Apex Malhar
>  Issue Type: Sub-task
>Reporter: David Yan
>
> This is needed by the spillable implementation of 
> SpillableWindowedKeyedStorage. In the implementation of WindowedOperator, 
> when purging a window, we need to purge all keys associated with a given 
> window.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (APEXMALHAR-2191) Implement a way to get all keys with or without prefix from SpillableByteMapImpl

2016-09-22 Thread David Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan closed APEXMALHAR-2191.
-

> Implement a way to get all keys with or without prefix from 
> SpillableByteMapImpl
> 
>
> Key: APEXMALHAR-2191
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2191
> Project: Apache Apex Malhar
>  Issue Type: Sub-task
>Reporter: David Yan
>Assignee: David Yan
>Priority: Critical
>
> WindowedKeyedStorage is basically a Map>, and we need the 
> capability of getting all K's given a window, and getting V given a window 
> and K.
> Currently, the spillable implementation of WindowedKeyedStorage uses two 
> spillable data structures -- SpillableByteMapImpl, V> and 
> SpillableArrayListMultimapImpl.  This will not work because we 
> need to be able to remove a key from a window, which SpillableArrayList does 
> not support, and having two separate spillable data structures if it can be 
> achieved by just one should be avoided in general. 
> We will implement the solution that supports prefix scanning (given a window, 
> return all keys), which requires the keys to be stored in order in managed 
> state.
> Traversing all keys in the spillable map is needed by WindowStateMap. The 
> WindowedOperator needs a way to traverse all windows in its state when firing 
> a trigger. However, this is less urgent since the Window meta info is small 
> and should fit in memory even with millions of windows. This is more for the 
> checkpointing efficiency.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (APEXMALHAR-2192) Implement SpillableByteArrayListMultimapImpl.removeAll(Object key)

2016-09-22 Thread David Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan closed APEXMALHAR-2192.
-

> Implement SpillableByteArrayListMultimapImpl.removeAll(Object key)
> --
>
> Key: APEXMALHAR-2192
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2192
> Project: Apache Apex Malhar
>  Issue Type: Sub-task
>Reporter: David Yan
>
> This is needed by the spillable implementation of 
> SpillableWindowedKeyedStorage. In the implementation of WindowedOperator, 
> when purging a window, we need to purge all keys associated with a given 
> window.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (APEXMALHAR-2189) Implement SpillableArrayListImpl.iterator

2016-09-22 Thread David Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan closed APEXMALHAR-2189.
-

> Implement SpillableArrayListImpl.iterator
> -
>
> Key: APEXMALHAR-2189
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2189
> Project: Apache Apex Malhar
>  Issue Type: Sub-task
>Reporter: David Yan
>
> This is needed by the spiillable storage implementation of WindowedOperator 
> so that it can iterate through all the keys in the window to determine 
> whether or not to fire a trigger.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (APEXMALHAR-2188) Implement SpillableByteArrayListMultimapImpl.containsEntry

2016-09-22 Thread David Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan closed APEXMALHAR-2188.
-

> Implement SpillableByteArrayListMultimapImpl.containsEntry
> --
>
> Key: APEXMALHAR-2188
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2188
> Project: Apache Apex Malhar
>  Issue Type: Sub-task
>Reporter: David Yan
>
> This method is needed by SpillableWindowedKeyedStorage.put(). It needs to 
> find out whether the key already exists in the window before pushing to the 
> Spillable ArrayList. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (APEXMALHAR-2191) Implement a way to get all keys with or without prefix from SpillableByteMapImpl

2016-09-22 Thread David Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan updated APEXMALHAR-2191:
--
Assignee: (was: David Yan)

> Implement a way to get all keys with or without prefix from 
> SpillableByteMapImpl
> 
>
> Key: APEXMALHAR-2191
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2191
> Project: Apache Apex Malhar
>  Issue Type: Sub-task
>Reporter: David Yan
>Priority: Critical
>
> WindowedKeyedStorage is basically a Map>, and we need the 
> capability of getting all K's given a window, and getting V given a window 
> and K.
> Currently, the spillable implementation of WindowedKeyedStorage uses two 
> spillable data structures -- SpillableByteMapImpl, V> and 
> SpillableArrayListMultimapImpl.  This will not work because we 
> need to be able to remove a key from a window, which SpillableArrayList does 
> not support, and having two separate spillable data structures if it can be 
> achieved by just one should be avoided in general. 
> We will implement the solution that supports prefix scanning (given a window, 
> return all keys), which requires the keys to be stored in order in managed 
> state.
> Traversing all keys in the spillable map is needed by WindowStateMap. The 
> WindowedOperator needs a way to traverse all windows in its state when firing 
> a trigger. However, this is less urgent since the Window meta info is small 
> and should fit in memory even with millions of windows. This is more for the 
> checkpointing efficiency.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (APEXMALHAR-2249) Create SpillableSet and SpillableSetMultimap interfaces and implementation

2016-09-22 Thread David Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan updated APEXMALHAR-2249:
--
Assignee: (was: David Yan)

> Create SpillableSet and SpillableSetMultimap interfaces and implementation
> --
>
> Key: APEXMALHAR-2249
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2249
> Project: Apache Apex Malhar
>  Issue Type: Sub-task
>Reporter: David Yan
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (APEXMALHAR-2193) Implement SpillableByteArrayListMultimapImpl.remove(@Nullable Object key, @Nullable Object value)

2016-09-22 Thread David Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan closed APEXMALHAR-2193.
-

> Implement SpillableByteArrayListMultimapImpl.remove(@Nullable Object key, 
> @Nullable Object value)
> -
>
> Key: APEXMALHAR-2193
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2193
> Project: Apache Apex Malhar
>  Issue Type: Sub-task
>Reporter: David Yan
>
> This is needed by SpillableSessionWindowedStorage. It needs a way to remove 
> the session window given the key in its internal keyToWindowsMap



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (APEXMALHAR-2231) Implement a Spillable map that takes two keys

2016-09-22 Thread David Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Yan closed APEXMALHAR-2231.
-
Assignee: (was: David Yan)

> Implement a Spillable map that takes two keys
> -
>
> Key: APEXMALHAR-2231
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2231
> Project: Apache Apex Malhar
>  Issue Type: Sub-task
>Reporter: David Yan
>
> This is similar to Map> with the ability to get all K2's given 
> a K1, and remove all entries with a given K1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] apex-malhar pull request #423: APEXMALHAR-2241 - The metadata kafka consumer...

2016-09-22 Thread venkateshkottapalli
GitHub user venkateshkottapalli opened a pull request:

https://github.com/apache/apex-malhar/pull/423

APEXMALHAR-2241 - The metadata kafka consumer should also pickup the 
properties setting on the kafka input operator

Fixes two issues - 
* metadata kafka consumer not picking up properties set 
* Consumer properties set from Properties.xml are  not getting picked.
* Added test case to validate the consumer Properties are not reset.

@siyuanh Please review.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/venkateshkottapalli/apex-malhar 
APEXMALHAR-2241-KafkaConsumer-MetaDataProps

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/423.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #423


commit 02c040ac184396602116f6a1eec26b0c838a6d23
Author: venkateshDT 
Date:   2016-09-22T20:27:32Z

fixes two issues - pick the metadata consumer properties and setting the 
consumer properties from Properties.xml for kafka inputoperator not working

added test case

Removed trailing spaces

Removed unused imports




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (APEXMALHAR-2241) The metadata kafka consumer should also pickup the properties setting on the kafka input operator.

2016-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514640#comment-15514640
 ] 

ASF GitHub Bot commented on APEXMALHAR-2241:


GitHub user venkateshkottapalli opened a pull request:

https://github.com/apache/apex-malhar/pull/423

APEXMALHAR-2241 - The metadata kafka consumer should also pickup the 
properties setting on the kafka input operator

Fixes two issues - 
* metadata kafka consumer not picking up properties set 
* Consumer properties set from Properties.xml are  not getting picked.
* Added test case to validate the consumer Properties are not reset.

@siyuanh Please review.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/venkateshkottapalli/apex-malhar 
APEXMALHAR-2241-KafkaConsumer-MetaDataProps

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/423.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #423


commit 02c040ac184396602116f6a1eec26b0c838a6d23
Author: venkateshDT 
Date:   2016-09-22T20:27:32Z

fixes two issues - pick the metadata consumer properties and setting the 
consumer properties from Properties.xml for kafka inputoperator not working

added test case

Removed trailing spaces

Removed unused imports




> The metadata kafka consumer should also pickup the properties setting on the 
> kafka input operator. 
> ---
>
> Key: APEXMALHAR-2241
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2241
> Project: Apache Apex Malhar
>  Issue Type: Bug
>Reporter: Siyuan Hua
>Assignee: Venkatesh Kottapalli
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] apex-malhar pull request #345: APEXMALHAR-2130 REVIEW ONLY (WindowedOperator...

2016-09-22 Thread davidyan74
Github user davidyan74 closed the pull request at:

https://github.com/apache/apex-malhar/pull/345


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (APEXMALHAR-2130) Scalable windowed storage

2016-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514686#comment-15514686
 ] 

ASF GitHub Bot commented on APEXMALHAR-2130:


Github user davidyan74 closed the pull request at:

https://github.com/apache/apex-malhar/pull/345


> Scalable windowed storage
> -
>
> Key: APEXMALHAR-2130
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2130
> Project: Apache Apex Malhar
>  Issue Type: Task
>Reporter: bright chen
>Assignee: David Yan
>  Labels: roadmap
>
> This feature is used for supporting windowing.
> The storage needs to have the following features:
> 1. Spillable key value storage (integrate with APEXMALHAR-2026)
> 2. Upon checkpoint, it saves a snapshot for the entire data set with the 
> checkpointing window id.  This should be done incrementally (ManagedState) to 
> avoid wasting space with unchanged data
> 3. When recovering, it takes the recovery window id and restores to that 
> snapshot
> 4. When a window is committed, all windows with a lower ID should be purged 
> from the store.
> 5. It should implement the WindowedStorage and WindowedKeyedStorage 
> interfaces, and because of 2 and 3, we may want to add methods to the 
> WindowedStorage interface so that the implementation of WindowedOperator can 
> notify the storage of checkpointing, recovering and committing of a window.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] apex-malhar pull request #424: APEXMALHAR-2130 Spillable implementation for ...

2016-09-22 Thread davidyan74
GitHub user davidyan74 opened a pull request:

https://github.com/apache/apex-malhar/pull/424

APEXMALHAR-2130 Spillable implementation for WindowedOperator

@tweise @brightchen @siyuanh Please review and merge

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/davidyan74/apex-malhar windowedSpillable-PR

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/424.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #424


commit ede7f056919ba11c607b06d302153a7498e5e3e6
Author: David Yan 
Date:   2016-08-15T21:19:08Z

APEXMALHAR-2130 Spillable implementation for WindowedOperator




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (APEXMALHAR-2130) Scalable windowed storage

2016-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514690#comment-15514690
 ] 

ASF GitHub Bot commented on APEXMALHAR-2130:


GitHub user davidyan74 opened a pull request:

https://github.com/apache/apex-malhar/pull/424

APEXMALHAR-2130 Spillable implementation for WindowedOperator

@tweise @brightchen @siyuanh Please review and merge

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/davidyan74/apex-malhar windowedSpillable-PR

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/424.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #424


commit ede7f056919ba11c607b06d302153a7498e5e3e6
Author: David Yan 
Date:   2016-08-15T21:19:08Z

APEXMALHAR-2130 Spillable implementation for WindowedOperator




> Scalable windowed storage
> -
>
> Key: APEXMALHAR-2130
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2130
> Project: Apache Apex Malhar
>  Issue Type: Task
>Reporter: bright chen
>Assignee: David Yan
>  Labels: roadmap
>
> This feature is used for supporting windowing.
> The storage needs to have the following features:
> 1. Spillable key value storage (integrate with APEXMALHAR-2026)
> 2. Upon checkpoint, it saves a snapshot for the entire data set with the 
> checkpointing window id.  This should be done incrementally (ManagedState) to 
> avoid wasting space with unchanged data
> 3. When recovering, it takes the recovery window id and restores to that 
> snapshot
> 4. When a window is committed, all windows with a lower ID should be purged 
> from the store.
> 5. It should implement the WindowedStorage and WindowedKeyedStorage 
> interfaces, and because of 2 and 3, we may want to add methods to the 
> WindowedStorage interface so that the implementation of WindowedOperator can 
> notify the storage of checkpointing, recovering and committing of a window.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] apex-malhar pull request #425: add jms input operator doc to malhar docs

2016-09-22 Thread sanjaypujare
GitHub user sanjaypujare opened a pull request:

https://github.com/apache/apex-malhar/pull/425

add jms input operator doc to malhar docs

@amberarrow  closing PR https://github.com/DataTorrent/docs/pull/81 and 
opening this one. Contents are the same.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sanjaypujare/apex-malhar 
jms-input-operator-doc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/425.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #425


commit 97a5e62bd963d7bc45dddf93c982073e675f6ef4
Author: Sanjay Pujare 
Date:   2016-09-22T23:04:42Z

add jms input operator doc to malhar docs




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (APEXCORE-528) Output Ports Not Optional by Default During Validation

2016-09-22 Thread Alex McCullough (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514909#comment-15514909
 ] 

Alex McCullough commented on APEXCORE-528:
--

Yep, that works. My thought was that if the default value ever changed you 
wouldn't want to have to update this code too, but I suppose the reality of it 
is that's not likely to happen.

> Output Ports Not Optional by Default During Validation
> --
>
> Key: APEXCORE-528
> URL: https://issues.apache.org/jira/browse/APEXCORE-528
> Project: Apache Apex Core
>  Issue Type: Bug
>Affects Versions: 3.4.0
>Reporter: Alex McCullough
>Assignee: Alex McCullough
>Priority: Minor
>
> The 'optional' OutputPortFieldAnnotation states that the default value is 
> true. When you build a DAG with multiple output ports the validator throws an 
> error telling you at least one must be connected. To fix, you must explicitly 
> add the annotation to all output ports and set the value to True, which is 
> supposed to already be the default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXMALHAR-1818) Integrate Calcite to support SQL

2016-09-22 Thread Julian Hyde (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514919#comment-15514919
 ] 

Julian Hyde commented on APEXMALHAR-1818:
-

[~chinmay], I watched the demo video and took a look at your code, and it all 
looks fine. I suspect that you could get semantic stuff like view substitution 
with a low amount of effort.

I know you're planning inner join (I presume stream-to-stream join?). It will 
be interesting when you also support streaming GROUP BY and streaming windowed 
aggregate functions.

I started work on a streaming TCK in Calcite; see CALCITE-1114. You could 
potentially use that to verify that your implementation of streaming SQL is 
correct.

> Integrate Calcite to support SQL
> 
>
> Key: APEXMALHAR-1818
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-1818
> Project: Apache Apex Malhar
>  Issue Type: New Feature
>  Components: query operators
>Reporter: Amol
>Assignee: Chinmay Kolhatkar
>  Labels: roadmap
>
> Once we have ability to generate a subdag, we should take a look at 
> integrating Calcite into Apex. The operator that enables populate DAG, should 
> use Calcite to generate the DAG, given a SQL query.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (APEXMALHAR-2264) Add documentation for jmsInputOperator

2016-09-22 Thread Sanjay M Pujare (JIRA)
Sanjay M Pujare created APEXMALHAR-2264:
---

 Summary: Add documentation for jmsInputOperator
 Key: APEXMALHAR-2264
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2264
 Project: Apache Apex Malhar
  Issue Type: Documentation
  Components: documentation
Reporter: Sanjay M Pujare
Assignee: Sanjay M Pujare
Priority: Minor


Add jmsInputOperator based on the latest enhancements made to that operator



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] apex-malhar pull request #425: APEXMALHAR-2264 add jms input operator doc to...

2016-09-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/apex-malhar/pull/425


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (APEXMALHAR-2264) Add documentation for jmsInputOperator

2016-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15515182#comment-15515182
 ] 

ASF GitHub Bot commented on APEXMALHAR-2264:


Github user asfgit closed the pull request at:

https://github.com/apache/apex-malhar/pull/425


> Add documentation for jmsInputOperator
> --
>
> Key: APEXMALHAR-2264
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2264
> Project: Apache Apex Malhar
>  Issue Type: Documentation
>  Components: documentation
>Reporter: Sanjay M Pujare
>Assignee: Sanjay M Pujare
>Priority: Minor
>
> Add jmsInputOperator based on the latest enhancements made to that operator



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (APEXMALHAR-2264) Add documentation for jmsInputOperator

2016-09-22 Thread Munagala V. Ramanath (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Munagala V. Ramanath resolved APEXMALHAR-2264.
--
   Resolution: Fixed
Fix Version/s: 3.6.0

Merged

> Add documentation for jmsInputOperator
> --
>
> Key: APEXMALHAR-2264
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2264
> Project: Apache Apex Malhar
>  Issue Type: Documentation
>  Components: documentation
>Reporter: Sanjay M Pujare
>Assignee: Sanjay M Pujare
>Priority: Minor
> Fix For: 3.6.0
>
>
> Add jmsInputOperator based on the latest enhancements made to that operator



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (APEXMALHAR-2265) Add entries to mkdocs.yml for recently added operator docs

2016-09-22 Thread Munagala V. Ramanath (JIRA)
Munagala V. Ramanath created APEXMALHAR-2265:


 Summary: Add entries to mkdocs.yml for recently added operator docs
 Key: APEXMALHAR-2265
 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2265
 Project: Apache Apex Malhar
  Issue Type: Bug
  Components: documentation
Reporter: Munagala V. Ramanath
Assignee: Munagala V. Ramanath
Priority: Minor


Add entries to mkdocs.yml for recently added operator docs



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] apex-core pull request #396: APEXCORE-538 - Log raw data when RPC message fa...

2016-09-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/apex-core/pull/396


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (APEXCORE-538) Log raw data when RPC message fails to de-serialize

2016-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15515202#comment-15515202
 ] 

ASF GitHub Bot commented on APEXCORE-538:
-

Github user asfgit closed the pull request at:

https://github.com/apache/apex-core/pull/396


> Log raw data when RPC message fails to de-serialize 
> 
>
> Key: APEXCORE-538
> URL: https://issues.apache.org/jira/browse/APEXCORE-538
> Project: Apache Apex Core
>  Issue Type: Improvement
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
> Fix For: 3.3.1, 3.2.2, 3.5.0, 3.4.1
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXMALHAR-2265) Add entries to mkdocs.yml for recently added operator docs

2016-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15515218#comment-15515218
 ] 

ASF GitHub Bot commented on APEXMALHAR-2265:


GitHub user amberarrow opened a pull request:

https://github.com/apache/apex-malhar/pull/426

APEXMALHAR-2265 Added entries in mkdocs.yml for 2 newly added docs

@sashadt Please take a look

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/amberarrow/incubator-apex-malhar 
APEXMALHAR-2265.add-mkdocs-entries

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/426.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #426


commit 2291d9a0dbca15a6f43c51d9611682269e9e6c0f
Author: Munagala V. Ramanath 
Date:   2016-09-23T03:04:34Z

APEXMALHAR-2265 Added entries in mkdocs.yml for 2 newly added docs




> Add entries to mkdocs.yml for recently added operator docs
> --
>
> Key: APEXMALHAR-2265
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2265
> Project: Apache Apex Malhar
>  Issue Type: Bug
>  Components: documentation
>Reporter: Munagala V. Ramanath
>Assignee: Munagala V. Ramanath
>Priority: Minor
>
> Add entries to mkdocs.yml for recently added operator docs



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] apex-malhar pull request #426: APEXMALHAR-2265 Added entries in mkdocs.yml f...

2016-09-22 Thread amberarrow
GitHub user amberarrow opened a pull request:

https://github.com/apache/apex-malhar/pull/426

APEXMALHAR-2265 Added entries in mkdocs.yml for 2 newly added docs

@sashadt Please take a look

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/amberarrow/incubator-apex-malhar 
APEXMALHAR-2265.add-mkdocs-entries

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/apex-malhar/pull/426.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #426


commit 2291d9a0dbca15a6f43c51d9611682269e9e6c0f
Author: Munagala V. Ramanath 
Date:   2016-09-23T03:04:34Z

APEXMALHAR-2265 Added entries in mkdocs.yml for 2 newly added docs




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (APEXCORE-538) Log raw data when RPC message fails to de-serialize

2016-09-22 Thread Thomas Weise (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXCORE-538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise resolved APEXCORE-538.
---
Resolution: Fixed

> Log raw data when RPC message fails to de-serialize 
> 
>
> Key: APEXCORE-538
> URL: https://issues.apache.org/jira/browse/APEXCORE-538
> Project: Apache Apex Core
>  Issue Type: Improvement
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
> Fix For: 3.2.2, 3.5.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (APEXCORE-538) Log raw data when RPC message fails to de-serialize

2016-09-22 Thread Thomas Weise (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXCORE-538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise updated APEXCORE-538:
--
Fix Version/s: (was: 3.4.1)
   (was: 3.3.1)

> Log raw data when RPC message fails to de-serialize 
> 
>
> Key: APEXCORE-538
> URL: https://issues.apache.org/jira/browse/APEXCORE-538
> Project: Apache Apex Core
>  Issue Type: Improvement
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
> Fix For: 3.2.2, 3.5.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (APEXMALHAR-2265) Add entries to mkdocs.yml for recently added operator docs

2016-09-22 Thread Thomas Weise (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise resolved APEXMALHAR-2265.
--
   Resolution: Fixed
Fix Version/s: 3.6.0

> Add entries to mkdocs.yml for recently added operator docs
> --
>
> Key: APEXMALHAR-2265
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2265
> Project: Apache Apex Malhar
>  Issue Type: Bug
>  Components: documentation
>Reporter: Munagala V. Ramanath
>Assignee: Munagala V. Ramanath
>Priority: Minor
> Fix For: 3.6.0
>
>
> Add entries to mkdocs.yml for recently added operator docs



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] apex-malhar pull request #426: APEXMALHAR-2265 Added entries in mkdocs.yml f...

2016-09-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/apex-malhar/pull/426


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (APEXMALHAR-2265) Add entries to mkdocs.yml for recently added operator docs

2016-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15515375#comment-15515375
 ] 

ASF GitHub Bot commented on APEXMALHAR-2265:


Github user asfgit closed the pull request at:

https://github.com/apache/apex-malhar/pull/426


> Add entries to mkdocs.yml for recently added operator docs
> --
>
> Key: APEXMALHAR-2265
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2265
> Project: Apache Apex Malhar
>  Issue Type: Bug
>  Components: documentation
>Reporter: Munagala V. Ramanath
>Assignee: Munagala V. Ramanath
>Priority: Minor
> Fix For: 3.6.0
>
>
> Add entries to mkdocs.yml for recently added operator docs



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Trying to acces live data

2016-09-22 Thread Hitesh Goyal
Hi team,
Trying to access live data from couchbase using input Operators.
It is running fine on my local. The tuples are emitting as Pojo through the 
output port of inputOperator.
But when I upload the .apa file on data torrent, it gets launched but soon it 
goes into Failed State just after been accepted.
Can you please help me to resolve the issue ? I am sending logs as an attached 
file.
Note:- I am not using built in operators. I have made my own operators which 
are quite similar to CouchBasePojoInputOperator and CouchBaseStore.

Regards,
Hitesh Goyal
Simpli5d Technologies
Cont No.: 9996588220

2016-09-22 12:40:30,308 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
Updating application attempt appattempt_1474540741175_0001_01 with final 
state: FAILED, and exit status: 1
2016-09-22 12:40:30,308 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
appattempt_1474540741175_0001_01 State change from LAUNCHED to FINAL_SAVING
2016-09-22 12:40:30,308 INFO 
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: 
Unregistering app attempt : appattempt_1474540741175_0001_01
2016-09-22 12:40:30,309 INFO 
org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: 
Application finished, removing password for appattempt_1474540741175_0001_01
2016-09-22 12:40:30,309 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
appattempt_1474540741175_0001_01 State change from FINAL_SAVING to FAILED
2016-09-22 12:40:30,309 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The number of 
failed attempts is 1. The max attempts is 2
2016-09-22 12:40:30,309 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
 Application Attempt appattempt_1474540741175_0001_01 is done. 
finalState=FAILED
2016-09-22 12:40:30,309 INFO 
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: 
Registering app attempt : appattempt_1474540741175_0001_02
2016-09-22 12:40:30,309 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
appattempt_1474540741175_0001_02 State change from NEW to SUBMITTED
2016-09-22 12:40:30,309 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: 
Application application_1474540741175_0001 requests cleared
2016-09-22 12:40:30,310 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
Application removed - appId: application_1474540741175_0001 user: dtadmin 
queue: default #user-pending-applications: 0 #user-active-applications: 0 
#queue-pending-applications: 0 #queue-active-applications: 0
2016-09-22 12:40:30,310 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
Application application_1474540741175_0001 from user: dtadmin activated in 
queue: default
2016-09-22 12:40:30,310 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
Application added - appId: application_1474540741175_0001 user: 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@77d0922b,
 leaf-queue: default #user-pending-applications: 0 #user-active-applications: 1 
#queue-pending-applications: 0 #queue-active-applications: 1
2016-09-22 12:40:30,310 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
 Added Application Attempt appattempt_1474540741175_0001_02 to scheduler 
from user dtadmin in queue default
2016-09-22 12:40:30,310 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
appattempt_1474540741175_0001_02 State change from SUBMITTED to SCHEDULED
2016-09-22 12:40:31,308 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
 Null container completed...
2016-09-22 12:40:31,308 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
container_1474540741175_0001_02_01 Container Transitioned from NEW to 
ALLOCATED
2016-09-22 12:40:31,308 INFO 
org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=dtadmin  
OPERATION=AM Allocated ContainerTARGET=SchedulerApp RESULT=SUCCESS  
APPID=application_1474540741175_0001
CONTAINERID=container_1474540741175_0001_02_01
2016-09-22 12:40:31,308 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned 
container container_1474540741175_0001_02_01 of capacity  on host localhost.localdomain:35189, which has 1 containers, 
 used and  available after 
allocation
2016-09-22 12:40:31,308 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
assignedContainer application attempt=appattempt_1474540741175_0001_02 
container=Container: [ContainerId: container_1474540741175_0001_02_01, 
NodeId: localhost.localdomain:35189, NodeHttpAddress: 
localhost.localdomain:8042, Resourc

Re: Trying to acces live data

2016-09-22 Thread Priyanka Gugale
Hi Hitesh,

Can you collect all logs including app master logs and send it? Run command
"yarn logs -applicationId " on cluster to get logs for an
application.
Also please collect stderr output logs of appMaster. You can find them
under "/logs/userlogs///"

-Priyanka

On Fri, Sep 23, 2016 at 10:55 AM, Hitesh Goyal 
wrote:

> Hi team,
>
> Trying to access live data from couchbase using input Operators.
>
> It is running fine on my local. The tuples are emitting as Pojo through
> the output port of inputOperator.
>
> But when I upload the .apa file on data torrent, it gets launched but soon
> it goes into Failed State just after been accepted.
>
> Can you please help me to resolve the issue ? I am sending logs as an
> attached file.
>
> Note:- I am not using built in operators. I have made my own operators
> which are quite similar to CouchBasePojoInputOperator and CouchBaseStore.
>
>
>
> Regards,
>
> *Hitesh Goyal*
>
> Simpli5d Technologies
>
> Cont No.: 9996588220
>
>
>