Re: PrintWriter for LogicalPlan
Thanks tushar. I got it working in similar ways: 1. For DAG as JSON String: Map stringObjectMap = LogicalPlanSerializer.convertToMap(dag, true); ObjectMapper mapper = new ObjectMapper(); mapper.enable(SerializationFeature.INDENT_OUTPUT); String s = mapper.writeValueAsString(stringObjectMap); System.out.println(s); 2. For DAG as properties file: StringWriter stringWriter = new StringWriter(); LogicalPlanSerializer.convertToProperties(dag).save(stringWriter); System.out.println(stringWriter.toString()); Either works for me. Thanks for all the help. -Chinmay. On Thu, Sep 22, 2016 at 12:22 PM, Tushar Gosavi wrote: > Hi Chinmay, > > take a look at following gist for example. > https://gist.github.com/anonymous/8df8e05ffc16e620bbc030e1034da3c4 > > - Tushar. > > > On Thu, Sep 22, 2016 at 12:05 PM, Chinmay Kolhatkar > wrote: > > Hi Tushar, > > > > Thanks for the information. Can you please provide me pointers on how to > > use it? > > > > I can always examine the LogicalPlan structure but for a test case, I > think > > its an overkill, instead if I have a utility which can print LogicalPlan > > (similar to LogicalPlanSerializer), then its just the matter for string > > comparison which one has to do in test for verification. > > > > -Chinmay. > > > > > > > > On Thu, Sep 22, 2016 at 11:59 AM, Tushar Gosavi > > wrote: > > > >> There is a LogicalPlanSerializer in apex engine, which will generate > >> json equivalent of DAG, which can be printed easily. > >> But for verification check if right components are added in DAG by > >> examining dag structure than printing it. > >> > >> - Tushar. > >> > >> > >> On Thu, Sep 22, 2016 at 11:39 AM, Chinmay Kolhatkar > >> wrote: > >> > Hi All, > >> > > >> > For testing of calcite generated DAG, I want to verify whether the > >> correct > >> > DAG is created with right operators and streams and whether properties > >> are > >> > set properly. > >> > > >> > Is there any PrintWriter for the LogicalPlan object printing > components > >> of > >> > DAG? > >> > > >> > If not, I am planning to write one for Calcite testing. > >> > I was wondering if that would be useful addition to apex-core for > testing > >> > purpose. > >> > > >> > Please share your opinion. > >> > > >> > Thanks, > >> > Chinmay. > >> >
[jira] [Assigned] (APEXCORE-310) apex cli - support to kill the app by appname
[ https://issues.apache.org/jira/browse/APEXCORE-310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deepak Narkhede reassigned APEXCORE-310: Assignee: Deepak Narkhede > apex cli - support to kill the app by appname > - > > Key: APEXCORE-310 > URL: https://issues.apache.org/jira/browse/APEXCORE-310 > Project: Apache Apex Core > Issue Type: Improvement >Reporter: Sandesh >Assignee: Deepak Narkhede >Priority: Minor > > dtcli should support the ability to kill the app by appname. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXCORE-528) Output Ports Not Optional by Default During Validation
[ https://issues.apache.org/jira/browse/APEXCORE-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15512979#comment-15512979 ] Deepak Narkhede commented on APEXCORE-528: -- Hi Alex, Are you currently working on it. If not, Please let me know I would like to take it up. Thanks, Deepak > Output Ports Not Optional by Default During Validation > -- > > Key: APEXCORE-528 > URL: https://issues.apache.org/jira/browse/APEXCORE-528 > Project: Apache Apex Core > Issue Type: Bug >Affects Versions: 3.4.0 >Reporter: Alex McCullough >Assignee: Alex McCullough >Priority: Minor > > The 'optional' OutputPortFieldAnnotation states that the default value is > true. When you build a DAG with multiple output ports the validator throws an > error telling you at least one must be connected. To fix, you must explicitly > add the annotation to all output ports and set the value to True, which is > supposed to already be the default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: excluding hadoop dependencies from application package
Definitely worth adding. Ram On Wed, Sep 21, 2016 at 1:20 PM, Pramod Immaneni wrote: > Candidate to be added here? > > https://apex.apache.org/docs/apex/development_best_practices/ > > On Wed, Sep 21, 2016 at 12:24 PM, Munagala Ramanath > wrote: > > > Some info here: > > http://docs.datatorrent.com/troubleshooting/#hadoop- > dependencies-conflicts > > > > Ram > > > > > > On Wed, Sep 21, 2016 at 12:00 PM, Vlad Rozov > > wrote: > > > > > Is subject already documented? > > > > > > Thank you, > > > > > > Vlad > > > > > >
Trying to access live data from couchbase
Hi team, Trying to access live data from couchbase using input Operators. It is running fine on my local. The tuples are emitting as Pojo through the output port of inputOperator. But when I upload the .apa file on data torrent, it gets launched but soon it goes into Failed State just after been accepted. Can you please help me to resolve the issue ? I am sending logs as an attached file. Regards, Hitesh Goyal Simpli5d Technologies Cont No.: 9996588220 2016-09-22 12:40:30,308 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Updating application attempt appattempt_1474540741175_0001_01 with final state: FAILED, and exit status: 1 2016-09-22 12:40:30,308 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1474540741175_0001_01 State change from LAUNCHED to FINAL_SAVING 2016-09-22 12:40:30,308 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Unregistering app attempt : appattempt_1474540741175_0001_01 2016-09-22 12:40:30,309 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Application finished, removing password for appattempt_1474540741175_0001_01 2016-09-22 12:40:30,309 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1474540741175_0001_01 State change from FINAL_SAVING to FAILED 2016-09-22 12:40:30,309 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The number of failed attempts is 1. The max attempts is 2 2016-09-22 12:40:30,309 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application Attempt appattempt_1474540741175_0001_01 is done. finalState=FAILED 2016-09-22 12:40:30,309 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1474540741175_0001_02 2016-09-22 12:40:30,309 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1474540741175_0001_02 State change from NEW to SUBMITTED 2016-09-22 12:40:30,309 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1474540741175_0001 requests cleared 2016-09-22 12:40:30,310 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application removed - appId: application_1474540741175_0001 user: dtadmin queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0 2016-09-22 12:40:30,310 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application application_1474540741175_0001 from user: dtadmin activated in queue: default 2016-09-22 12:40:30,310 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application added - appId: application_1474540741175_0001 user: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@77d0922b, leaf-queue: default #user-pending-applications: 0 #user-active-applications: 1 #queue-pending-applications: 0 #queue-active-applications: 1 2016-09-22 12:40:30,310 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Added Application Attempt appattempt_1474540741175_0001_02 to scheduler from user dtadmin in queue default 2016-09-22 12:40:30,310 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1474540741175_0001_02 State change from SUBMITTED to SCHEDULED 2016-09-22 12:40:31,308 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed... 2016-09-22 12:40:31,308 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1474540741175_0001_02_01 Container Transitioned from NEW to ALLOCATED 2016-09-22 12:40:31,308 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=dtadmin OPERATION=AM Allocated ContainerTARGET=SchedulerApp RESULT=SUCCESS APPID=application_1474540741175_0001 CONTAINERID=container_1474540741175_0001_02_01 2016-09-22 12:40:31,308 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_1474540741175_0001_02_01 of capacity on host localhost.localdomain:35189, which has 1 containers, used and available after allocation 2016-09-22 12:40:31,308 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: assignedContainer application attempt=appattempt_1474540741175_0001_02 container=Container: [ContainerId: container_1474540741175_0001_02_01, NodeId: localhost.localdomain:35189, NodeHttpAddress: localhost.localdomain:8042, Resource: , Priority: 0, Token: null, ] queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=, usedCapacity=0.0, absoluteUsedCapacity=0.0,
[jira] [Commented] (APEXCORE-528) Output Ports Not Optional by Default During Validation
[ https://issues.apache.org/jira/browse/APEXCORE-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513421#comment-15513421 ] Alex McCullough commented on APEXCORE-528: -- Hey Deepak, I am working on this. I did want to clarify the exact rules that should be in place though for the output ports. I see in the comments it states: // multiple ports w/o annotation, one of them must be connected Is this truly the rule? My proposal would be for output ports, there is never a requirement to connect them unless the annotation explicitly marks it as optional = false. Thanks, Alex > Output Ports Not Optional by Default During Validation > -- > > Key: APEXCORE-528 > URL: https://issues.apache.org/jira/browse/APEXCORE-528 > Project: Apache Apex Core > Issue Type: Bug >Affects Versions: 3.4.0 >Reporter: Alex McCullough >Assignee: Alex McCullough >Priority: Minor > > The 'optional' OutputPortFieldAnnotation states that the default value is > true. When you build a DAG with multiple output ports the validator throws an > error telling you at least one must be connected. To fix, you must explicitly > add the annotation to all output ports and set the value to True, which is > supposed to already be the default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXCORE-528) Output Ports Not Optional by Default During Validation
[ https://issues.apache.org/jira/browse/APEXCORE-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513460#comment-15513460 ] Thomas Weise commented on APEXCORE-528: --- IMO output ports that don't have an annotation should be considered optional, that's also the default in the annotation. > Output Ports Not Optional by Default During Validation > -- > > Key: APEXCORE-528 > URL: https://issues.apache.org/jira/browse/APEXCORE-528 > Project: Apache Apex Core > Issue Type: Bug >Affects Versions: 3.4.0 >Reporter: Alex McCullough >Assignee: Alex McCullough >Priority: Minor > > The 'optional' OutputPortFieldAnnotation states that the default value is > true. When you build a DAG with multiple output ports the validator throws an > error telling you at least one must be connected. To fix, you must explicitly > add the annotation to all output ports and set the value to True, which is > supposed to already be the default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXCORE-528) Output Ports Not Optional by Default During Validation
[ https://issues.apache.org/jira/browse/APEXCORE-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513507#comment-15513507 ] Alex McCullough commented on APEXCORE-528: -- Because when no annotations are added the OutputPortMeta.portAnnotation is null, it can't find the defaults to populate. Is there any issue with using reflection to populate defaults when no explicit annotation value is provided? Then you can just have standard logic and not worry about null objects and such. > Output Ports Not Optional by Default During Validation > -- > > Key: APEXCORE-528 > URL: https://issues.apache.org/jira/browse/APEXCORE-528 > Project: Apache Apex Core > Issue Type: Bug >Affects Versions: 3.4.0 >Reporter: Alex McCullough >Assignee: Alex McCullough >Priority: Minor > > The 'optional' OutputPortFieldAnnotation states that the default value is > true. When you build a DAG with multiple output ports the validator throws an > error telling you at least one must be connected. To fix, you must explicitly > add the annotation to all output ports and set the value to True, which is > supposed to already be the default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXCORE-528) Output Ports Not Optional by Default During Validation
[ https://issues.apache.org/jira/browse/APEXCORE-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513515#comment-15513515 ] Thomas Weise commented on APEXCORE-528: --- Can you point me to the exact place in the code you are referring to? > Output Ports Not Optional by Default During Validation > -- > > Key: APEXCORE-528 > URL: https://issues.apache.org/jira/browse/APEXCORE-528 > Project: Apache Apex Core > Issue Type: Bug >Affects Versions: 3.4.0 >Reporter: Alex McCullough >Assignee: Alex McCullough >Priority: Minor > > The 'optional' OutputPortFieldAnnotation states that the default value is > true. When you build a DAG with multiple output ports the validator throws an > error telling you at least one must be connected. To fix, you must explicitly > add the annotation to all output ports and set the value to True, which is > supposed to already be the default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXCORE-528) Output Ports Not Optional by Default During Validation
[ https://issues.apache.org/jira/browse/APEXCORE-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513525#comment-15513525 ] Alex McCullough commented on APEXCORE-528: -- https://github.com/apache/apex-core/blob/master/engine/src/main/java/com/datatorrent/stram/plan/logical/LogicalPlan.java#L1748-L1773 > Output Ports Not Optional by Default During Validation > -- > > Key: APEXCORE-528 > URL: https://issues.apache.org/jira/browse/APEXCORE-528 > Project: Apache Apex Core > Issue Type: Bug >Affects Versions: 3.4.0 >Reporter: Alex McCullough >Assignee: Alex McCullough >Priority: Minor > > The 'optional' OutputPortFieldAnnotation states that the default value is > true. When you build a DAG with multiple output ports the validator throws an > error telling you at least one must be connected. To fix, you must explicitly > add the annotation to all output ports and set the value to True, which is > supposed to already be the default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (APEXCORE-538) Log raw data when RPC message fails to de-serialize
Vlad Rozov created APEXCORE-538: --- Summary: Log raw data when RPC message fails to de-serialize Key: APEXCORE-538 URL: https://issues.apache.org/jira/browse/APEXCORE-538 Project: Apache Apex Core Issue Type: Improvement Reporter: Vlad Rozov Assignee: Vlad Rozov Fix For: 3.2.2, 3.3.1, 3.4.1, 3.5.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (APEXMALHAR-2261) Python binding for high level API
[ https://issues.apache.org/jira/browse/APEXMALHAR-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Weise updated APEXMALHAR-2261: - Labels: roadmap (was: ) > Python binding for high level API > - > > Key: APEXMALHAR-2261 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2261 > Project: Apache Apex Malhar > Issue Type: New Feature >Reporter: Thomas Weise > Labels: roadmap > > A high level API similar to the Apex Java stream API that lets users specify > an application in Python. > https://lists.apache.org/thread.html/9837b1dee8f909ed400c6030ce5c6a94a12f43183718019dd0bfd228@%3Cdev.apex.apache.org%3E -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Python support
I created the following JIRAs: https://issues.apache.org/jira/browse/APEXMALHAR-2260 https://issues.apache.org/jira/browse/APEXMALHAR-2261 On Wed, Sep 21, 2016 at 11:10 PM, Chinmay Kolhatkar wrote: > I would like to help in contributing to this feature. > > On Wed, Sep 21, 2016 at 12:26 AM, Sasha Parfenov > wrote: > > > +1 on both executing Python code in an operator and high level API for > > constructing Pipelines in Python. > > > > There is a large user base of engineers and data scientists which use > > Python on regular basis for crunching through big data. Providing them > > with a powerful new platform for big data processing, wrapped in a > familiar > > language, will open Apex to a much broader user base and help grow the > > project. > > > > Given the potentially new user base of Python developers, it may make > sense > > to prioritize the high level API for pipeline construction. This will > > allow users to build simple applications with existing library operators, > > and we can get feedback on what areas they would like to see improved > next > > - custom Python operator support or more built-in library operators. > > > > Thanks, > > Sasha > > > > On Thu, Sep 15, 2016 at 2:06 PM, Thomas Weise wrote: > > > > > Hi, > > > > > > Python (not Jython) seems to be a popular language and frequently used > > for > > > data analysis, especially where flexibility matters. It has a > > comprehensive > > > library and it is generally considered low barrier to entry. I have > also > > > seen Python used in critical back-end components, although that's > > probably > > > not very common? > > > > > > I think Python support could potentially expand the user base for Apex. > > > There are 2 main areas that can be considered: > > > > > > 1) Support to execute Python code through an operator > > > 2) A client API that lets users construct pipelines in Python > > > > > > The former can exist without the latter. And it would enable users to > > > leverage existing code that otherwise would have to be rewritten in a > JVM > > > language. The engine could ship scripts/packages so they are > > automatically > > > distributed on the cluster. > > > > > > A useful client API probably requires back-end support for lambda > > functions > > > and more complex UDFs. > > > > > > Would be great to get some feedback, especially from those that have > > > experience with Python, on how an integration could potentially open up > > new > > > use cases for Apex. > > > > > > Thanks, > > > Thomas > > > > > >
[jira] [Commented] (APEXCORE-528) Output Ports Not Optional by Default During Validation
[ https://issues.apache.org/jira/browse/APEXCORE-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513735#comment-15513735 ] Thomas Weise commented on APEXCORE-528: --- Isn't it just a matter of removing the allPortsOptional check? > Output Ports Not Optional by Default During Validation > -- > > Key: APEXCORE-528 > URL: https://issues.apache.org/jira/browse/APEXCORE-528 > Project: Apache Apex Core > Issue Type: Bug >Affects Versions: 3.4.0 >Reporter: Alex McCullough >Assignee: Alex McCullough >Priority: Minor > > The 'optional' OutputPortFieldAnnotation states that the default value is > true. When you build a DAG with multiple output ports the validator throws an > error telling you at least one must be connected. To fix, you must explicitly > add the annotation to all output ports and set the value to True, which is > supposed to already be the default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (APEXMALHAR-2261) Python binding for high level API
Thomas Weise created APEXMALHAR-2261: Summary: Python binding for high level API Key: APEXMALHAR-2261 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2261 Project: Apache Apex Malhar Issue Type: New Feature Reporter: Thomas Weise A high level API similar to the Apex Java stream API that lets users specify an application in Python. https://lists.apache.org/thread.html/9837b1dee8f909ed400c6030ce5c6a94a12f43183718019dd0bfd228@%3Cdev.apex.apache.org%3E -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (APEXMALHAR-2130) implement scalable windowed storage
[ https://issues.apache.org/jira/browse/APEXMALHAR-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Weise updated APEXMALHAR-2130: - Labels: roadmap (was: ) > implement scalable windowed storage > --- > > Key: APEXMALHAR-2130 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2130 > Project: Apache Apex Malhar > Issue Type: Task >Reporter: bright chen >Assignee: David Yan > Labels: roadmap > > This feature is used for supporting windowing. > The storage needs to have the following features: > 1. Spillable key value storage (integrate with APEXMALHAR-2026) > 2. Upon checkpoint, it saves a snapshot for the entire data set with the > checkpointing window id. This should be done incrementally (ManagedState) to > avoid wasting space with unchanged data > 3. When recovering, it takes the recovery window id and restores to that > snapshot > 4. When a window is committed, all windows with a lower ID should be purged > from the store. > 5. It should implement the WindowedStorage and WindowedKeyedStorage > interfaces, and because of 2 and 3, we may want to add methods to the > WindowedStorage interface so that the implementation of WindowedOperator can > notify the storage of checkpointing, recovering and committing of a window. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (APEXMALHAR-2130) Scalable windowed storage
[ https://issues.apache.org/jira/browse/APEXMALHAR-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Weise updated APEXMALHAR-2130: - Summary: Scalable windowed storage (was: implement scalable windowed storage) > Scalable windowed storage > - > > Key: APEXMALHAR-2130 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2130 > Project: Apache Apex Malhar > Issue Type: Task >Reporter: bright chen >Assignee: David Yan > Labels: roadmap > > This feature is used for supporting windowing. > The storage needs to have the following features: > 1. Spillable key value storage (integrate with APEXMALHAR-2026) > 2. Upon checkpoint, it saves a snapshot for the entire data set with the > checkpointing window id. This should be done incrementally (ManagedState) to > avoid wasting space with unchanged data > 3. When recovering, it takes the recovery window id and restores to that > snapshot > 4. When a window is committed, all windows with a lower ID should be purged > from the store. > 5. It should implement the WindowedStorage and WindowedKeyedStorage > interfaces, and because of 2 and 3, we may want to add methods to the > WindowedStorage interface so that the implementation of WindowedOperator can > notify the storage of checkpointing, recovering and committing of a window. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (APEXMALHAR-2260) Python execution for operator logic
Thomas Weise created APEXMALHAR-2260: Summary: Python execution for operator logic Key: APEXMALHAR-2260 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2260 Project: Apache Apex Malhar Issue Type: New Feature Reporter: Thomas Weise Support execution of Python code in an operator. https://lists.apache.org/thread.html/9837b1dee8f909ed400c6030ce5c6a94a12f43183718019dd0bfd228@%3Cdev.apex.apache.org%3E -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXMALHAR-2244) Optimize WindowedStorage and Spillable data structures for time series
[ https://issues.apache.org/jira/browse/APEXMALHAR-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513770#comment-15513770 ] Siyuan Hua commented on APEXMALHAR-2244: Each spillable DS implementation use a SpillableStateStore to store things and we can make ManagedTimeUnifiedStateImpl implement the store as well and it can take some time extract function to get/calculate time and time buckets from each V/KV data. And the Store can be setup by the WindowedOperator, correct? > Optimize WindowedStorage and Spillable data structures for time series > -- > > Key: APEXMALHAR-2244 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2244 > Project: Apache Apex Malhar > Issue Type: Sub-task >Reporter: David Yan >Assignee: Siyuan Hua > > The spillable data structures currently does not make any assumption about > the key that is used in Managed State, and as a result, it uses > ManagedStateImpl to interface with Managed State and uses time buckets that > are based on the apex window id. But for WindowedStorage used by > WindowedOperator, the key to the storage is a window, which is event time > based. Using the default ManagedStateImpl would be very inefficient for event > time based keys, since it would write data that would belong to the same > window to different time buckets. > On a high level, the below summarizes roughly what needs to be done: > 1. a way to tell the spillable data structures to use the > ManagedTimeUnifiedStateImpl > 2. a way to tell the spillable data structures how to extract the timestamp > from the key. Note that in the case of WindowedOperator, the timestamp should > be the end timestamp of the window (beginTimeMillis + durationMillis), not > the begin timestamp. > 3. a way to tell the spillable data structures how to assign the time bucket > given that timestamp > 4. with point 3, the spillable implementations of WindowedStorage will need > to take a config parameter that says how much time (in millis) is each time > bucket > 5. only purge a time bucket when all keys that belong to that time bucket are > removed and the apex window id of the first window in which the keys are all > removed has been committed -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] apex-core pull request #396: APEXCORE-538 - Log raw data when RPC message fa...
GitHub user vrozov opened a pull request: https://github.com/apache/apex-core/pull/396 APEXCORE-538 - Log raw data when RPC message fails to de-serialize @tweise Please review and cherry-pick. You can merge this pull request into a Git repository by running: $ git pull https://github.com/vrozov/apex-core APEXCORE-538 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-core/pull/396.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #396 commit e0ee2d105ad4091eee2e14cf4c519bdb732734d8 Author: Vlad Rozov Date: 2016-09-22T16:31:32Z APEXCORE-538 - Log raw data when RPC message fails to de-serialize --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (APEXCORE-538) Log raw data when RPC message fails to de-serialize
[ https://issues.apache.org/jira/browse/APEXCORE-538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513785#comment-15513785 ] ASF GitHub Bot commented on APEXCORE-538: - GitHub user vrozov opened a pull request: https://github.com/apache/apex-core/pull/396 APEXCORE-538 - Log raw data when RPC message fails to de-serialize @tweise Please review and cherry-pick. You can merge this pull request into a Git repository by running: $ git pull https://github.com/vrozov/apex-core APEXCORE-538 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-core/pull/396.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #396 commit e0ee2d105ad4091eee2e14cf4c519bdb732734d8 Author: Vlad Rozov Date: 2016-09-22T16:31:32Z APEXCORE-538 - Log raw data when RPC message fails to de-serialize > Log raw data when RPC message fails to de-serialize > > > Key: APEXCORE-538 > URL: https://issues.apache.org/jira/browse/APEXCORE-538 > Project: Apache Apex Core > Issue Type: Improvement >Reporter: Vlad Rozov >Assignee: Vlad Rozov > Fix For: 3.3.1, 3.2.2, 3.5.0, 3.4.1 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] apex-malhar pull request #408: APEXMALHAR-2248 Added SpillableSet and Spilla...
Github user asfgit closed the pull request at: https://github.com/apache/apex-malhar/pull/408 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (APEXMALHAR-2248) Create SpillableSet and SpillableSetMultimap interfaces and implementation
[ https://issues.apache.org/jira/browse/APEXMALHAR-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513787#comment-15513787 ] ASF GitHub Bot commented on APEXMALHAR-2248: Github user asfgit closed the pull request at: https://github.com/apache/apex-malhar/pull/408 > Create SpillableSet and SpillableSetMultimap interfaces and implementation > -- > > Key: APEXMALHAR-2248 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2248 > Project: Apache Apex Malhar > Issue Type: Sub-task >Reporter: David Yan >Assignee: David Yan > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (APEXMALHAR-2248) Create SpillableSet and SpillableSetMultimap interfaces and implementation
[ https://issues.apache.org/jira/browse/APEXMALHAR-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siyuan Hua updated APEXMALHAR-2248: --- Fix Version/s: 3.6.0 > Create SpillableSet and SpillableSetMultimap interfaces and implementation > -- > > Key: APEXMALHAR-2248 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2248 > Project: Apache Apex Malhar > Issue Type: Sub-task >Reporter: David Yan >Assignee: David Yan > Fix For: 3.6.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (APEXMALHAR-2248) Create SpillableSet and SpillableSetMultimap interfaces and implementation
[ https://issues.apache.org/jira/browse/APEXMALHAR-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siyuan Hua resolved APEXMALHAR-2248. Resolution: Fixed > Create SpillableSet and SpillableSetMultimap interfaces and implementation > -- > > Key: APEXMALHAR-2248 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2248 > Project: Apache Apex Malhar > Issue Type: Sub-task >Reporter: David Yan >Assignee: David Yan > Fix For: 3.6.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] apex-malhar pull request #404: APEXMALHAR-2190 #resolve #comment Use reusabl...
Github user brightchen closed the pull request at: https://github.com/apache/apex-malhar/pull/404 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] apex-malhar pull request #404: APEXMALHAR-2190 #resolve #comment Use reusabl...
GitHub user brightchen reopened a pull request: https://github.com/apache/apex-malhar/pull/404 APEXMALHAR-2190 #resolve #comment Use reusable buffer to serial spill⦠â¦able data structure You can merge this pull request into a Git repository by running: $ git pull https://github.com/brightchen/apex-malhar APEXMALHAR-2190-PR Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-malhar/pull/404.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #404 commit b17ca37cf7f0ae8037ce69206e9ae914f8168d33 Author: brightchen Date: 2016-08-16T00:46:27Z APEXMALHAR-2190 #resolve #comment Use reusable buffer to serial spillable data structure --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (APEXMALHAR-2190) Use reusable buffer to serial spillable data structure
[ https://issues.apache.org/jira/browse/APEXMALHAR-2190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513929#comment-15513929 ] ASF GitHub Bot commented on APEXMALHAR-2190: Github user brightchen closed the pull request at: https://github.com/apache/apex-malhar/pull/404 > Use reusable buffer to serial spillable data structure > -- > > Key: APEXMALHAR-2190 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2190 > Project: Apache Apex Malhar > Issue Type: Task >Reporter: bright chen >Assignee: bright chen > Original Estimate: 240h > Remaining Estimate: 240h > > Spillable Data Structure created lots of temporary memory to serial data lot > of of memory copy( see SliceUtils.concatenate(byte[], byte[]). Which used up > memory very quickly. See APEXMALHAR-2182. > Use a shared memory to avoid allocate temporary memory and memory copy > some basic ideas > - SerToLVBuffer interface provides a method serTo(T object, LengthValueBuffer > buffer): instead of create a memory and then return the serialized data, this > method let the caller pass in the buffer. So different objects or object with > embed objects can share the same LengthValueBuffer > - LengthValueBuffer: It is a buffer which manage the memory as length and > value(which is the generic format of serialized data). which provide length > placeholder mechanism to avoid temporary memory and data copy when the length > can be know after data serialized > - memory management classes: includes interface ByteStream and it's > implementations: Block, FixedBlock, BlocksStream. Which provides a mechanism > to dynamic allocate and manage memory. Which basically provides following > function. I tried other some other stream mechamism such as > ByteArrayInputStream, but it can meet 3rd criteria, and don't have good > performance(50% loss) > - dynamic allocate memory > - reset memory for reuse > - BlocksStream make sure the output slices will not be changed when need > extra memory; Block can change the reference of output slices buffer is data > was moved due to reallocate of memory(BlocksStream is better solution). > - WindowableBlocksStream extends from BlocksStream and provides function to > reset memory window by window instead of reset all memory. It provides > certain amount of cache( as bytes ) in memory -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXMALHAR-2190) Use reusable buffer to serial spillable data structure
[ https://issues.apache.org/jira/browse/APEXMALHAR-2190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513930#comment-15513930 ] ASF GitHub Bot commented on APEXMALHAR-2190: GitHub user brightchen reopened a pull request: https://github.com/apache/apex-malhar/pull/404 APEXMALHAR-2190 #resolve #comment Use reusable buffer to serial spill… …able data structure You can merge this pull request into a Git repository by running: $ git pull https://github.com/brightchen/apex-malhar APEXMALHAR-2190-PR Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-malhar/pull/404.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #404 commit b17ca37cf7f0ae8037ce69206e9ae914f8168d33 Author: brightchen Date: 2016-08-16T00:46:27Z APEXMALHAR-2190 #resolve #comment Use reusable buffer to serial spillable data structure > Use reusable buffer to serial spillable data structure > -- > > Key: APEXMALHAR-2190 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2190 > Project: Apache Apex Malhar > Issue Type: Task >Reporter: bright chen >Assignee: bright chen > Original Estimate: 240h > Remaining Estimate: 240h > > Spillable Data Structure created lots of temporary memory to serial data lot > of of memory copy( see SliceUtils.concatenate(byte[], byte[]). Which used up > memory very quickly. See APEXMALHAR-2182. > Use a shared memory to avoid allocate temporary memory and memory copy > some basic ideas > - SerToLVBuffer interface provides a method serTo(T object, LengthValueBuffer > buffer): instead of create a memory and then return the serialized data, this > method let the caller pass in the buffer. So different objects or object with > embed objects can share the same LengthValueBuffer > - LengthValueBuffer: It is a buffer which manage the memory as length and > value(which is the generic format of serialized data). which provide length > placeholder mechanism to avoid temporary memory and data copy when the length > can be know after data serialized > - memory management classes: includes interface ByteStream and it's > implementations: Block, FixedBlock, BlocksStream. Which provides a mechanism > to dynamic allocate and manage memory. Which basically provides following > function. I tried other some other stream mechamism such as > ByteArrayInputStream, but it can meet 3rd criteria, and don't have good > performance(50% loss) > - dynamic allocate memory > - reset memory for reuse > - BlocksStream make sure the output slices will not be changed when need > extra memory; Block can change the reference of output slices buffer is data > was moved due to reallocate of memory(BlocksStream is better solution). > - WindowableBlocksStream extends from BlocksStream and provides function to > reset memory window by window instead of reset all memory. It provides > certain amount of cache( as bytes ) in memory -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (APEXCORE-539) DefaultAttributeMap is not thread safe
Vlad Rozov created APEXCORE-539: --- Summary: DefaultAttributeMap is not thread safe Key: APEXCORE-539 URL: https://issues.apache.org/jira/browse/APEXCORE-539 Project: Apache Apex Core Issue Type: Bug Reporter: Vlad Rozov DefaultAttributeMap put() and get() methods may be called from different threads and current implementation that uses HashMap is not thread safe. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXCORE-539) DefaultAttributeMap is not thread safe
[ https://issues.apache.org/jira/browse/APEXCORE-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15513980#comment-15513980 ] Thomas Weise commented on APEXCORE-539: --- It wasn't the original intention to modify the map once it is shared between threads. Where is it happening? > DefaultAttributeMap is not thread safe > -- > > Key: APEXCORE-539 > URL: https://issues.apache.org/jira/browse/APEXCORE-539 > Project: Apache Apex Core > Issue Type: Bug >Reporter: Vlad Rozov > > DefaultAttributeMap put() and get() methods may be called from different > threads and current implementation that uses HashMap is not thread safe. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXCORE-539) DefaultAttributeMap is not thread safe
[ https://issues.apache.org/jira/browse/APEXCORE-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514021#comment-15514021 ] Vlad Rozov commented on APEXCORE-539: - I see the following exception after I changed DefaultAttributeMap to store and compare thread with the current thread: {noformat} 2016-09-22 10:36:46,849 [container-0] ERROR stram.StramLocalCluster run - Container container-0 failed java.util.ConcurrentModificationException: current thread Thread[container-0,5,main] existing thread Thread[main,5,main] at com.datatorrent.api.Attribute$AttributeMap$DefaultAttributeMap.put(Attribute.java:209) at com.datatorrent.stram.engine.StreamingContainer.setup(StreamingContainer.java:160) at com.datatorrent.stram.StramLocalCluster$LocalStreamingContainer.run(StramLocalCluster.java:178) at com.datatorrent.stram.StramLocalCluster$LocalStramChildLauncher.run(StramLocalCluster.java:269) at java.lang.Thread.run(Thread.java:745) {noformat} Why not to use ConcurrentHashMap instead of HashMap? > DefaultAttributeMap is not thread safe > -- > > Key: APEXCORE-539 > URL: https://issues.apache.org/jira/browse/APEXCORE-539 > Project: Apache Apex Core > Issue Type: Bug >Reporter: Vlad Rozov > > DefaultAttributeMap put() and get() methods may be called from different > threads and current implementation that uses HashMap is not thread safe. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXCORE-532) New dynamically added operator does not start with correct windowId.
[ https://issues.apache.org/jira/browse/APEXCORE-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514071#comment-15514071 ] Pramod Immaneni commented on APEXCORE-532: -- [~tushargosavi] checking why this is broken when it used to work before. > New dynamically added operator does not start with correct windowId. > > > Key: APEXCORE-532 > URL: https://issues.apache.org/jira/browse/APEXCORE-532 > Project: Apache Apex Core > Issue Type: Bug >Reporter: Tushar Gosavi >Priority: Critical > > During dynamic DAG change, If new operator is added and connected to existing > operator, it does not starts with correct windowId. The baseSeconds is set to > 0 causing windowId management problems at master effectively halting purge > from buffer server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (APEXMALHAR-2262) lock on is too AbstractManagedStateImpl.getValueFromBucketSync wide
bright chen created APEXMALHAR-2262: --- Summary: lock on is too AbstractManagedStateImpl.getValueFromBucketSync wide Key: APEXMALHAR-2262 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2262 Project: Apache Apex Malhar Issue Type: Improvement Reporter: bright chen Assignee: bright chen The Managed State used a lot of lock, which could impact a lot on performance. AbstractManagedStateImpl.getValueFromBucketSync(long, long, Slice) lock the buck to get value, But if the value still in memory, the lock is not necessary as flash is ConcurrentMap. probably AbstractManagedStateImpl should only lock when add/remove bucket. And bucket handle read/write lock inside bucket - -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXCORE-532) New dynamically added operator does not start with correct windowId.
[ https://issues.apache.org/jira/browse/APEXCORE-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514117#comment-15514117 ] Tushar Gosavi commented on APEXCORE-532: If newly added operator is attached to some operator which was already exists in the DAG, - It starts from activation checkpoint - bufferserver subscriber use activation checkpoint id to set the initial baseSeconds (setup of BufferServerSubscriber). - when new data is received, we get partital windowId form tuple, or it with baseSeconds and construct full windowId. baseSeconds is expected to set to correct value if operator starts from known checkpoint, during initial dag deployment input operators sends resetWindow tuple, which sets baseSeconds for all downstream operators. In this case operator was added in the end and was started from activation checkpoint. but no resetWindow tuple was received to set its baseSeconds to correct value. In this case the starting checkpoint is different than windowId, may be we should use two fields in deploy info - one for checkpoint from where to load the state - starting baseSeconds which is taken from existing upstream to correctly set the windowIds in absense of resetWindow tuple. > New dynamically added operator does not start with correct windowId. > > > Key: APEXCORE-532 > URL: https://issues.apache.org/jira/browse/APEXCORE-532 > Project: Apache Apex Core > Issue Type: Bug >Reporter: Tushar Gosavi >Priority: Critical > > During dynamic DAG change, If new operator is added and connected to existing > operator, it does not starts with correct windowId. The baseSeconds is set to > 0 causing windowId management problems at master effectively halting purge > from buffer server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (APEXMALHAR-2262) lock on AbstractManagedStateImpl.getValueFromBucketSync is too wide
[ https://issues.apache.org/jira/browse/APEXMALHAR-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] bright chen updated APEXMALHAR-2262: Summary: lock on AbstractManagedStateImpl.getValueFromBucketSync is too wide (was: lock on is too AbstractManagedStateImpl.getValueFromBucketSync wide) > lock on AbstractManagedStateImpl.getValueFromBucketSync is too wide > --- > > Key: APEXMALHAR-2262 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2262 > Project: Apache Apex Malhar > Issue Type: Improvement >Reporter: bright chen >Assignee: bright chen > > The Managed State used a lot of lock, which could impact a lot on > performance. > AbstractManagedStateImpl.getValueFromBucketSync(long, long, Slice) lock the > buck to get value, But if the value still in memory, the lock is not > necessary as flash is ConcurrentMap. > probably AbstractManagedStateImpl should only lock when add/remove bucket. > And bucket handle read/write lock inside bucket > - -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (APEXMALHAR-2263) Offsets in AbstractFileInputOperator should be long rather than int
Munagala V. Ramanath created APEXMALHAR-2263: Summary: Offsets in AbstractFileInputOperator should be long rather than int Key: APEXMALHAR-2263 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2263 Project: Apache Apex Malhar Issue Type: Bug Components: adapters other Reporter: Munagala V. Ramanath Offsets in AbstractFileInputOperator use the int type which means files with more that (2**31 -1) records will cause overflows and mysterious failures. Should be changed to use long. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Trying to access live data from couchbase
Check Application master logs and stderr output. Thank you, Vlad On 9/22/16 04:09, Hitesh Goyal wrote: Hi team, Trying to access live data from couchbase using input Operators. It is running fine on my local. The tuples are emitting as Pojo through the output port of inputOperator. But when I upload the .apa file on data torrent, it gets launched but soon it goes into Failed State just after been accepted. Can you please help me to resolve the issue ? I am sending logs as an attached file. Regards, *Hitesh Goyal* Simpli5d Technologies Cont No.: 9996588220
[GitHub] apex-malhar pull request #399: REVIEW ONLY: exposing seeking and iterating o...
Github user davidyan74 closed the pull request at: https://github.com/apache/apex-malhar/pull/399 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] apex-malhar pull request #377: *Review Only* APEXMALHAR-2192 Added some spil...
Github user davidyan74 closed the pull request at: https://github.com/apache/apex-malhar/pull/377 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (APEXMALHAR-2192) Implement SpillableByteArrayListMultimapImpl.removeAll(Object key)
[ https://issues.apache.org/jira/browse/APEXMALHAR-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514345#comment-15514345 ] ASF GitHub Bot commented on APEXMALHAR-2192: Github user davidyan74 closed the pull request at: https://github.com/apache/apex-malhar/pull/377 > Implement SpillableByteArrayListMultimapImpl.removeAll(Object key) > -- > > Key: APEXMALHAR-2192 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2192 > Project: Apache Apex Malhar > Issue Type: Sub-task >Reporter: David Yan > > This is needed by the spillable implementation of > SpillableWindowedKeyedStorage. In the implementation of WindowedOperator, > when purging a window, we need to purge all keys associated with a given > window. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (APEXMALHAR-2191) Implement a way to get all keys with or without prefix from SpillableByteMapImpl
[ https://issues.apache.org/jira/browse/APEXMALHAR-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Yan closed APEXMALHAR-2191. - > Implement a way to get all keys with or without prefix from > SpillableByteMapImpl > > > Key: APEXMALHAR-2191 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2191 > Project: Apache Apex Malhar > Issue Type: Sub-task >Reporter: David Yan >Assignee: David Yan >Priority: Critical > > WindowedKeyedStorage is basically a Map>, and we need the > capability of getting all K's given a window, and getting V given a window > and K. > Currently, the spillable implementation of WindowedKeyedStorage uses two > spillable data structures -- SpillableByteMapImpl, V> and > SpillableArrayListMultimapImpl. This will not work because we > need to be able to remove a key from a window, which SpillableArrayList does > not support, and having two separate spillable data structures if it can be > achieved by just one should be avoided in general. > We will implement the solution that supports prefix scanning (given a window, > return all keys), which requires the keys to be stored in order in managed > state. > Traversing all keys in the spillable map is needed by WindowStateMap. The > WindowedOperator needs a way to traverse all windows in its state when firing > a trigger. However, this is less urgent since the Window meta info is small > and should fit in memory even with millions of windows. This is more for the > checkpointing efficiency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (APEXMALHAR-2192) Implement SpillableByteArrayListMultimapImpl.removeAll(Object key)
[ https://issues.apache.org/jira/browse/APEXMALHAR-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Yan closed APEXMALHAR-2192. - > Implement SpillableByteArrayListMultimapImpl.removeAll(Object key) > -- > > Key: APEXMALHAR-2192 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2192 > Project: Apache Apex Malhar > Issue Type: Sub-task >Reporter: David Yan > > This is needed by the spillable implementation of > SpillableWindowedKeyedStorage. In the implementation of WindowedOperator, > when purging a window, we need to purge all keys associated with a given > window. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (APEXMALHAR-2189) Implement SpillableArrayListImpl.iterator
[ https://issues.apache.org/jira/browse/APEXMALHAR-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Yan closed APEXMALHAR-2189. - > Implement SpillableArrayListImpl.iterator > - > > Key: APEXMALHAR-2189 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2189 > Project: Apache Apex Malhar > Issue Type: Sub-task >Reporter: David Yan > > This is needed by the spiillable storage implementation of WindowedOperator > so that it can iterate through all the keys in the window to determine > whether or not to fire a trigger. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (APEXMALHAR-2188) Implement SpillableByteArrayListMultimapImpl.containsEntry
[ https://issues.apache.org/jira/browse/APEXMALHAR-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Yan closed APEXMALHAR-2188. - > Implement SpillableByteArrayListMultimapImpl.containsEntry > -- > > Key: APEXMALHAR-2188 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2188 > Project: Apache Apex Malhar > Issue Type: Sub-task >Reporter: David Yan > > This method is needed by SpillableWindowedKeyedStorage.put(). It needs to > find out whether the key already exists in the window before pushing to the > Spillable ArrayList. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (APEXMALHAR-2191) Implement a way to get all keys with or without prefix from SpillableByteMapImpl
[ https://issues.apache.org/jira/browse/APEXMALHAR-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Yan updated APEXMALHAR-2191: -- Assignee: (was: David Yan) > Implement a way to get all keys with or without prefix from > SpillableByteMapImpl > > > Key: APEXMALHAR-2191 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2191 > Project: Apache Apex Malhar > Issue Type: Sub-task >Reporter: David Yan >Priority: Critical > > WindowedKeyedStorage is basically a Map>, and we need the > capability of getting all K's given a window, and getting V given a window > and K. > Currently, the spillable implementation of WindowedKeyedStorage uses two > spillable data structures -- SpillableByteMapImpl, V> and > SpillableArrayListMultimapImpl. This will not work because we > need to be able to remove a key from a window, which SpillableArrayList does > not support, and having two separate spillable data structures if it can be > achieved by just one should be avoided in general. > We will implement the solution that supports prefix scanning (given a window, > return all keys), which requires the keys to be stored in order in managed > state. > Traversing all keys in the spillable map is needed by WindowStateMap. The > WindowedOperator needs a way to traverse all windows in its state when firing > a trigger. However, this is less urgent since the Window meta info is small > and should fit in memory even with millions of windows. This is more for the > checkpointing efficiency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (APEXMALHAR-2249) Create SpillableSet and SpillableSetMultimap interfaces and implementation
[ https://issues.apache.org/jira/browse/APEXMALHAR-2249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Yan updated APEXMALHAR-2249: -- Assignee: (was: David Yan) > Create SpillableSet and SpillableSetMultimap interfaces and implementation > -- > > Key: APEXMALHAR-2249 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2249 > Project: Apache Apex Malhar > Issue Type: Sub-task >Reporter: David Yan > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (APEXMALHAR-2193) Implement SpillableByteArrayListMultimapImpl.remove(@Nullable Object key, @Nullable Object value)
[ https://issues.apache.org/jira/browse/APEXMALHAR-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Yan closed APEXMALHAR-2193. - > Implement SpillableByteArrayListMultimapImpl.remove(@Nullable Object key, > @Nullable Object value) > - > > Key: APEXMALHAR-2193 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2193 > Project: Apache Apex Malhar > Issue Type: Sub-task >Reporter: David Yan > > This is needed by SpillableSessionWindowedStorage. It needs a way to remove > the session window given the key in its internal keyToWindowsMap -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (APEXMALHAR-2231) Implement a Spillable map that takes two keys
[ https://issues.apache.org/jira/browse/APEXMALHAR-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Yan closed APEXMALHAR-2231. - Assignee: (was: David Yan) > Implement a Spillable map that takes two keys > - > > Key: APEXMALHAR-2231 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2231 > Project: Apache Apex Malhar > Issue Type: Sub-task >Reporter: David Yan > > This is similar to Map> with the ability to get all K2's given > a K1, and remove all entries with a given K1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] apex-malhar pull request #423: APEXMALHAR-2241 - The metadata kafka consumer...
GitHub user venkateshkottapalli opened a pull request: https://github.com/apache/apex-malhar/pull/423 APEXMALHAR-2241 - The metadata kafka consumer should also pickup the properties setting on the kafka input operator Fixes two issues - * metadata kafka consumer not picking up properties set * Consumer properties set from Properties.xml are not getting picked. * Added test case to validate the consumer Properties are not reset. @siyuanh Please review. You can merge this pull request into a Git repository by running: $ git pull https://github.com/venkateshkottapalli/apex-malhar APEXMALHAR-2241-KafkaConsumer-MetaDataProps Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-malhar/pull/423.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #423 commit 02c040ac184396602116f6a1eec26b0c838a6d23 Author: venkateshDT Date: 2016-09-22T20:27:32Z fixes two issues - pick the metadata consumer properties and setting the consumer properties from Properties.xml for kafka inputoperator not working added test case Removed trailing spaces Removed unused imports --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (APEXMALHAR-2241) The metadata kafka consumer should also pickup the properties setting on the kafka input operator.
[ https://issues.apache.org/jira/browse/APEXMALHAR-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514640#comment-15514640 ] ASF GitHub Bot commented on APEXMALHAR-2241: GitHub user venkateshkottapalli opened a pull request: https://github.com/apache/apex-malhar/pull/423 APEXMALHAR-2241 - The metadata kafka consumer should also pickup the properties setting on the kafka input operator Fixes two issues - * metadata kafka consumer not picking up properties set * Consumer properties set from Properties.xml are not getting picked. * Added test case to validate the consumer Properties are not reset. @siyuanh Please review. You can merge this pull request into a Git repository by running: $ git pull https://github.com/venkateshkottapalli/apex-malhar APEXMALHAR-2241-KafkaConsumer-MetaDataProps Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-malhar/pull/423.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #423 commit 02c040ac184396602116f6a1eec26b0c838a6d23 Author: venkateshDT Date: 2016-09-22T20:27:32Z fixes two issues - pick the metadata consumer properties and setting the consumer properties from Properties.xml for kafka inputoperator not working added test case Removed trailing spaces Removed unused imports > The metadata kafka consumer should also pickup the properties setting on the > kafka input operator. > --- > > Key: APEXMALHAR-2241 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2241 > Project: Apache Apex Malhar > Issue Type: Bug >Reporter: Siyuan Hua >Assignee: Venkatesh Kottapalli > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] apex-malhar pull request #345: APEXMALHAR-2130 REVIEW ONLY (WindowedOperator...
Github user davidyan74 closed the pull request at: https://github.com/apache/apex-malhar/pull/345 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (APEXMALHAR-2130) Scalable windowed storage
[ https://issues.apache.org/jira/browse/APEXMALHAR-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514686#comment-15514686 ] ASF GitHub Bot commented on APEXMALHAR-2130: Github user davidyan74 closed the pull request at: https://github.com/apache/apex-malhar/pull/345 > Scalable windowed storage > - > > Key: APEXMALHAR-2130 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2130 > Project: Apache Apex Malhar > Issue Type: Task >Reporter: bright chen >Assignee: David Yan > Labels: roadmap > > This feature is used for supporting windowing. > The storage needs to have the following features: > 1. Spillable key value storage (integrate with APEXMALHAR-2026) > 2. Upon checkpoint, it saves a snapshot for the entire data set with the > checkpointing window id. This should be done incrementally (ManagedState) to > avoid wasting space with unchanged data > 3. When recovering, it takes the recovery window id and restores to that > snapshot > 4. When a window is committed, all windows with a lower ID should be purged > from the store. > 5. It should implement the WindowedStorage and WindowedKeyedStorage > interfaces, and because of 2 and 3, we may want to add methods to the > WindowedStorage interface so that the implementation of WindowedOperator can > notify the storage of checkpointing, recovering and committing of a window. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] apex-malhar pull request #424: APEXMALHAR-2130 Spillable implementation for ...
GitHub user davidyan74 opened a pull request: https://github.com/apache/apex-malhar/pull/424 APEXMALHAR-2130 Spillable implementation for WindowedOperator @tweise @brightchen @siyuanh Please review and merge You can merge this pull request into a Git repository by running: $ git pull https://github.com/davidyan74/apex-malhar windowedSpillable-PR Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-malhar/pull/424.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #424 commit ede7f056919ba11c607b06d302153a7498e5e3e6 Author: David Yan Date: 2016-08-15T21:19:08Z APEXMALHAR-2130 Spillable implementation for WindowedOperator --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (APEXMALHAR-2130) Scalable windowed storage
[ https://issues.apache.org/jira/browse/APEXMALHAR-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514690#comment-15514690 ] ASF GitHub Bot commented on APEXMALHAR-2130: GitHub user davidyan74 opened a pull request: https://github.com/apache/apex-malhar/pull/424 APEXMALHAR-2130 Spillable implementation for WindowedOperator @tweise @brightchen @siyuanh Please review and merge You can merge this pull request into a Git repository by running: $ git pull https://github.com/davidyan74/apex-malhar windowedSpillable-PR Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-malhar/pull/424.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #424 commit ede7f056919ba11c607b06d302153a7498e5e3e6 Author: David Yan Date: 2016-08-15T21:19:08Z APEXMALHAR-2130 Spillable implementation for WindowedOperator > Scalable windowed storage > - > > Key: APEXMALHAR-2130 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2130 > Project: Apache Apex Malhar > Issue Type: Task >Reporter: bright chen >Assignee: David Yan > Labels: roadmap > > This feature is used for supporting windowing. > The storage needs to have the following features: > 1. Spillable key value storage (integrate with APEXMALHAR-2026) > 2. Upon checkpoint, it saves a snapshot for the entire data set with the > checkpointing window id. This should be done incrementally (ManagedState) to > avoid wasting space with unchanged data > 3. When recovering, it takes the recovery window id and restores to that > snapshot > 4. When a window is committed, all windows with a lower ID should be purged > from the store. > 5. It should implement the WindowedStorage and WindowedKeyedStorage > interfaces, and because of 2 and 3, we may want to add methods to the > WindowedStorage interface so that the implementation of WindowedOperator can > notify the storage of checkpointing, recovering and committing of a window. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] apex-malhar pull request #425: add jms input operator doc to malhar docs
GitHub user sanjaypujare opened a pull request: https://github.com/apache/apex-malhar/pull/425 add jms input operator doc to malhar docs @amberarrow closing PR https://github.com/DataTorrent/docs/pull/81 and opening this one. Contents are the same. You can merge this pull request into a Git repository by running: $ git pull https://github.com/sanjaypujare/apex-malhar jms-input-operator-doc Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-malhar/pull/425.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #425 commit 97a5e62bd963d7bc45dddf93c982073e675f6ef4 Author: Sanjay Pujare Date: 2016-09-22T23:04:42Z add jms input operator doc to malhar docs --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (APEXCORE-528) Output Ports Not Optional by Default During Validation
[ https://issues.apache.org/jira/browse/APEXCORE-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514909#comment-15514909 ] Alex McCullough commented on APEXCORE-528: -- Yep, that works. My thought was that if the default value ever changed you wouldn't want to have to update this code too, but I suppose the reality of it is that's not likely to happen. > Output Ports Not Optional by Default During Validation > -- > > Key: APEXCORE-528 > URL: https://issues.apache.org/jira/browse/APEXCORE-528 > Project: Apache Apex Core > Issue Type: Bug >Affects Versions: 3.4.0 >Reporter: Alex McCullough >Assignee: Alex McCullough >Priority: Minor > > The 'optional' OutputPortFieldAnnotation states that the default value is > true. When you build a DAG with multiple output ports the validator throws an > error telling you at least one must be connected. To fix, you must explicitly > add the annotation to all output ports and set the value to True, which is > supposed to already be the default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXMALHAR-1818) Integrate Calcite to support SQL
[ https://issues.apache.org/jira/browse/APEXMALHAR-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514919#comment-15514919 ] Julian Hyde commented on APEXMALHAR-1818: - [~chinmay], I watched the demo video and took a look at your code, and it all looks fine. I suspect that you could get semantic stuff like view substitution with a low amount of effort. I know you're planning inner join (I presume stream-to-stream join?). It will be interesting when you also support streaming GROUP BY and streaming windowed aggregate functions. I started work on a streaming TCK in Calcite; see CALCITE-1114. You could potentially use that to verify that your implementation of streaming SQL is correct. > Integrate Calcite to support SQL > > > Key: APEXMALHAR-1818 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-1818 > Project: Apache Apex Malhar > Issue Type: New Feature > Components: query operators >Reporter: Amol >Assignee: Chinmay Kolhatkar > Labels: roadmap > > Once we have ability to generate a subdag, we should take a look at > integrating Calcite into Apex. The operator that enables populate DAG, should > use Calcite to generate the DAG, given a SQL query. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (APEXMALHAR-2264) Add documentation for jmsInputOperator
Sanjay M Pujare created APEXMALHAR-2264: --- Summary: Add documentation for jmsInputOperator Key: APEXMALHAR-2264 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2264 Project: Apache Apex Malhar Issue Type: Documentation Components: documentation Reporter: Sanjay M Pujare Assignee: Sanjay M Pujare Priority: Minor Add jmsInputOperator based on the latest enhancements made to that operator -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] apex-malhar pull request #425: APEXMALHAR-2264 add jms input operator doc to...
Github user asfgit closed the pull request at: https://github.com/apache/apex-malhar/pull/425 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (APEXMALHAR-2264) Add documentation for jmsInputOperator
[ https://issues.apache.org/jira/browse/APEXMALHAR-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15515182#comment-15515182 ] ASF GitHub Bot commented on APEXMALHAR-2264: Github user asfgit closed the pull request at: https://github.com/apache/apex-malhar/pull/425 > Add documentation for jmsInputOperator > -- > > Key: APEXMALHAR-2264 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2264 > Project: Apache Apex Malhar > Issue Type: Documentation > Components: documentation >Reporter: Sanjay M Pujare >Assignee: Sanjay M Pujare >Priority: Minor > > Add jmsInputOperator based on the latest enhancements made to that operator -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (APEXMALHAR-2264) Add documentation for jmsInputOperator
[ https://issues.apache.org/jira/browse/APEXMALHAR-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Munagala V. Ramanath resolved APEXMALHAR-2264. -- Resolution: Fixed Fix Version/s: 3.6.0 Merged > Add documentation for jmsInputOperator > -- > > Key: APEXMALHAR-2264 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2264 > Project: Apache Apex Malhar > Issue Type: Documentation > Components: documentation >Reporter: Sanjay M Pujare >Assignee: Sanjay M Pujare >Priority: Minor > Fix For: 3.6.0 > > > Add jmsInputOperator based on the latest enhancements made to that operator -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (APEXMALHAR-2265) Add entries to mkdocs.yml for recently added operator docs
Munagala V. Ramanath created APEXMALHAR-2265: Summary: Add entries to mkdocs.yml for recently added operator docs Key: APEXMALHAR-2265 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2265 Project: Apache Apex Malhar Issue Type: Bug Components: documentation Reporter: Munagala V. Ramanath Assignee: Munagala V. Ramanath Priority: Minor Add entries to mkdocs.yml for recently added operator docs -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] apex-core pull request #396: APEXCORE-538 - Log raw data when RPC message fa...
Github user asfgit closed the pull request at: https://github.com/apache/apex-core/pull/396 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (APEXCORE-538) Log raw data when RPC message fails to de-serialize
[ https://issues.apache.org/jira/browse/APEXCORE-538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15515202#comment-15515202 ] ASF GitHub Bot commented on APEXCORE-538: - Github user asfgit closed the pull request at: https://github.com/apache/apex-core/pull/396 > Log raw data when RPC message fails to de-serialize > > > Key: APEXCORE-538 > URL: https://issues.apache.org/jira/browse/APEXCORE-538 > Project: Apache Apex Core > Issue Type: Improvement >Reporter: Vlad Rozov >Assignee: Vlad Rozov > Fix For: 3.3.1, 3.2.2, 3.5.0, 3.4.1 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (APEXMALHAR-2265) Add entries to mkdocs.yml for recently added operator docs
[ https://issues.apache.org/jira/browse/APEXMALHAR-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15515218#comment-15515218 ] ASF GitHub Bot commented on APEXMALHAR-2265: GitHub user amberarrow opened a pull request: https://github.com/apache/apex-malhar/pull/426 APEXMALHAR-2265 Added entries in mkdocs.yml for 2 newly added docs @sashadt Please take a look You can merge this pull request into a Git repository by running: $ git pull https://github.com/amberarrow/incubator-apex-malhar APEXMALHAR-2265.add-mkdocs-entries Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-malhar/pull/426.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #426 commit 2291d9a0dbca15a6f43c51d9611682269e9e6c0f Author: Munagala V. Ramanath Date: 2016-09-23T03:04:34Z APEXMALHAR-2265 Added entries in mkdocs.yml for 2 newly added docs > Add entries to mkdocs.yml for recently added operator docs > -- > > Key: APEXMALHAR-2265 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2265 > Project: Apache Apex Malhar > Issue Type: Bug > Components: documentation >Reporter: Munagala V. Ramanath >Assignee: Munagala V. Ramanath >Priority: Minor > > Add entries to mkdocs.yml for recently added operator docs -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] apex-malhar pull request #426: APEXMALHAR-2265 Added entries in mkdocs.yml f...
GitHub user amberarrow opened a pull request: https://github.com/apache/apex-malhar/pull/426 APEXMALHAR-2265 Added entries in mkdocs.yml for 2 newly added docs @sashadt Please take a look You can merge this pull request into a Git repository by running: $ git pull https://github.com/amberarrow/incubator-apex-malhar APEXMALHAR-2265.add-mkdocs-entries Alternatively you can review and apply these changes as the patch at: https://github.com/apache/apex-malhar/pull/426.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #426 commit 2291d9a0dbca15a6f43c51d9611682269e9e6c0f Author: Munagala V. Ramanath Date: 2016-09-23T03:04:34Z APEXMALHAR-2265 Added entries in mkdocs.yml for 2 newly added docs --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Resolved] (APEXCORE-538) Log raw data when RPC message fails to de-serialize
[ https://issues.apache.org/jira/browse/APEXCORE-538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Weise resolved APEXCORE-538. --- Resolution: Fixed > Log raw data when RPC message fails to de-serialize > > > Key: APEXCORE-538 > URL: https://issues.apache.org/jira/browse/APEXCORE-538 > Project: Apache Apex Core > Issue Type: Improvement >Reporter: Vlad Rozov >Assignee: Vlad Rozov > Fix For: 3.2.2, 3.5.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (APEXCORE-538) Log raw data when RPC message fails to de-serialize
[ https://issues.apache.org/jira/browse/APEXCORE-538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Weise updated APEXCORE-538: -- Fix Version/s: (was: 3.4.1) (was: 3.3.1) > Log raw data when RPC message fails to de-serialize > > > Key: APEXCORE-538 > URL: https://issues.apache.org/jira/browse/APEXCORE-538 > Project: Apache Apex Core > Issue Type: Improvement >Reporter: Vlad Rozov >Assignee: Vlad Rozov > Fix For: 3.2.2, 3.5.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (APEXMALHAR-2265) Add entries to mkdocs.yml for recently added operator docs
[ https://issues.apache.org/jira/browse/APEXMALHAR-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Weise resolved APEXMALHAR-2265. -- Resolution: Fixed Fix Version/s: 3.6.0 > Add entries to mkdocs.yml for recently added operator docs > -- > > Key: APEXMALHAR-2265 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2265 > Project: Apache Apex Malhar > Issue Type: Bug > Components: documentation >Reporter: Munagala V. Ramanath >Assignee: Munagala V. Ramanath >Priority: Minor > Fix For: 3.6.0 > > > Add entries to mkdocs.yml for recently added operator docs -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] apex-malhar pull request #426: APEXMALHAR-2265 Added entries in mkdocs.yml f...
Github user asfgit closed the pull request at: https://github.com/apache/apex-malhar/pull/426 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (APEXMALHAR-2265) Add entries to mkdocs.yml for recently added operator docs
[ https://issues.apache.org/jira/browse/APEXMALHAR-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15515375#comment-15515375 ] ASF GitHub Bot commented on APEXMALHAR-2265: Github user asfgit closed the pull request at: https://github.com/apache/apex-malhar/pull/426 > Add entries to mkdocs.yml for recently added operator docs > -- > > Key: APEXMALHAR-2265 > URL: https://issues.apache.org/jira/browse/APEXMALHAR-2265 > Project: Apache Apex Malhar > Issue Type: Bug > Components: documentation >Reporter: Munagala V. Ramanath >Assignee: Munagala V. Ramanath >Priority: Minor > Fix For: 3.6.0 > > > Add entries to mkdocs.yml for recently added operator docs -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Trying to acces live data
Hi team, Trying to access live data from couchbase using input Operators. It is running fine on my local. The tuples are emitting as Pojo through the output port of inputOperator. But when I upload the .apa file on data torrent, it gets launched but soon it goes into Failed State just after been accepted. Can you please help me to resolve the issue ? I am sending logs as an attached file. Note:- I am not using built in operators. I have made my own operators which are quite similar to CouchBasePojoInputOperator and CouchBaseStore. Regards, Hitesh Goyal Simpli5d Technologies Cont No.: 9996588220 2016-09-22 12:40:30,308 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Updating application attempt appattempt_1474540741175_0001_01 with final state: FAILED, and exit status: 1 2016-09-22 12:40:30,308 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1474540741175_0001_01 State change from LAUNCHED to FINAL_SAVING 2016-09-22 12:40:30,308 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Unregistering app attempt : appattempt_1474540741175_0001_01 2016-09-22 12:40:30,309 INFO org.apache.hadoop.yarn.server.resourcemanager.security.AMRMTokenSecretManager: Application finished, removing password for appattempt_1474540741175_0001_01 2016-09-22 12:40:30,309 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1474540741175_0001_01 State change from FINAL_SAVING to FAILED 2016-09-22 12:40:30,309 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The number of failed attempts is 1. The max attempts is 2 2016-09-22 12:40:30,309 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application Attempt appattempt_1474540741175_0001_01 is done. finalState=FAILED 2016-09-22 12:40:30,309 INFO org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: Registering app attempt : appattempt_1474540741175_0001_02 2016-09-22 12:40:30,309 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1474540741175_0001_02 State change from NEW to SUBMITTED 2016-09-22 12:40:30,309 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1474540741175_0001 requests cleared 2016-09-22 12:40:30,310 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application removed - appId: application_1474540741175_0001 user: dtadmin queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0 2016-09-22 12:40:30,310 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application application_1474540741175_0001 from user: dtadmin activated in queue: default 2016-09-22 12:40:30,310 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application added - appId: application_1474540741175_0001 user: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@77d0922b, leaf-queue: default #user-pending-applications: 0 #user-active-applications: 1 #queue-pending-applications: 0 #queue-active-applications: 1 2016-09-22 12:40:30,310 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Added Application Attempt appattempt_1474540741175_0001_02 to scheduler from user dtadmin in queue default 2016-09-22 12:40:30,310 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1474540741175_0001_02 State change from SUBMITTED to SCHEDULED 2016-09-22 12:40:31,308 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed... 2016-09-22 12:40:31,308 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1474540741175_0001_02_01 Container Transitioned from NEW to ALLOCATED 2016-09-22 12:40:31,308 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=dtadmin OPERATION=AM Allocated ContainerTARGET=SchedulerApp RESULT=SUCCESS APPID=application_1474540741175_0001 CONTAINERID=container_1474540741175_0001_02_01 2016-09-22 12:40:31,308 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned container container_1474540741175_0001_02_01 of capacity on host localhost.localdomain:35189, which has 1 containers, used and available after allocation 2016-09-22 12:40:31,308 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: assignedContainer application attempt=appattempt_1474540741175_0001_02 container=Container: [ContainerId: container_1474540741175_0001_02_01, NodeId: localhost.localdomain:35189, NodeHttpAddress: localhost.localdomain:8042, Resourc
Re: Trying to acces live data
Hi Hitesh, Can you collect all logs including app master logs and send it? Run command "yarn logs -applicationId " on cluster to get logs for an application. Also please collect stderr output logs of appMaster. You can find them under "/logs/userlogs///" -Priyanka On Fri, Sep 23, 2016 at 10:55 AM, Hitesh Goyal wrote: > Hi team, > > Trying to access live data from couchbase using input Operators. > > It is running fine on my local. The tuples are emitting as Pojo through > the output port of inputOperator. > > But when I upload the .apa file on data torrent, it gets launched but soon > it goes into Failed State just after been accepted. > > Can you please help me to resolve the issue ? I am sending logs as an > attached file. > > Note:- I am not using built in operators. I have made my own operators > which are quite similar to CouchBasePojoInputOperator and CouchBaseStore. > > > > Regards, > > *Hitesh Goyal* > > Simpli5d Technologies > > Cont No.: 9996588220 > > >