High CPU % even after increasing VCORES

2018-05-01 Thread Vivek Bhide
Hi, I have created a new operator which converts Avro messages to JSON messages by extending BaseOperator. I'm unable to use AvroToPojoOperator due to the complexity of the Avro structure. I'm trying to analyze its performance. When I use: VCORES MEMORY Tuples Emitted

Re: TimeBasedDedupOperator failing with NullPointer

2018-03-23 Thread Vivek Bhide
Hi Vlad, I have filed a JIRA https://issues.apache.org/jira/browse/APEXMALHAR-2557, but I am not sure about all the attributes like component, epic, or label. Please update the ticket with the correct values, or let me know and I will add them. Regards Vivek -- Sent from: http://apache-apex-users-l

Re: TimeBasedDedupOperator failing with NullPointer

2018-03-23 Thread Vivek Bhide
Hi Bhupesh, Please find the details below Dedup operator configuration : dt.application.KafkaDedupOutputOperatorTest.operator.dedupeOperator.prop.keyExpression {$.id} + {$.name} dt.application.KafkaDedupOutputOperatorTest.opera

Re: TimeBasedDedupOperator failing with NullPointer

2018-03-23 Thread Vivek Bhide
Hi Vlad, Yes, it's 3.8.0. Regards Vivek

TimeBasedDedupOperator failing with NullPointer

2018-03-22 Thread Vivek Bhide
Hi All, While using the TimeBasedDedupOperator for deduping, I see that the operator keeps failing with the NullPointer exception below. I don't know if it has anything to do with the configuration. I also see that the operator is always high on CPU usage, almost reaching 100%. I tried increasing vcores and als

Re: Need help on achieving end to end exactly once with KafkaIn and KafkaOut

2018-03-21 Thread Vivek Bhide
Thanks Sandesh for the confirmation. Can you point us to the updated version of this output operator? Regards Vivek

Re: Need help on achieving end to end exactly once with KafkaIn and KafkaOut

2018-03-21 Thread Vivek Bhide
Can someone please confirm our findings? It is very critical for us to solve this issue, since the whole pipeline's functionality is at stake. Regards Vivek

Re: Need help on achieving end to end exactly once with KafkaIn and KafkaOut

2018-03-20 Thread Vivek Bhide
Hi Pramod, We did some more research by adding more logging to the KafkaInput operator, and below are our findings. Application Setup: 1. WindowDataManager for KafkaInputOperator is set to FSWindowDataManager 2. The streaming window for the application is set to 5 seconds from 0.5 seconds to easily reprodu

Re: Need help on TimeBasedDedupOperator

2018-03-16 Thread Vivek Bhide
Thanks Bhupesh. Regards Vivek

Re: Need help on achieving end to end exactly once with KafkaIn and KafkaOut

2018-03-16 Thread Vivek Bhide
Hi Pramod, I tried it but I am not getting consistent results. It worked a few times but then failed again with the same error. Is there anything else you would recommend validating? Regards Vivek

Need help on achieving end to end exactly once with KafkaIn and KafkaOut

2018-03-15 Thread Vivek Bhide
This is a continuation of my previous question about KafkaSinglePortExactlyOnceOutputOperator. I am trying to achieve end-to-end exactly-once processing, with data received from one Kafka topic and finally posted to another Kafka topic. Below are the three things needed, as Pramod mentioned in on

Need help on TimeBasedDedupOperator

2018-03-15 Thread Vivek Bhide
Hi, I'm trying to understand the working of TimeBasedDedupOperator for my streaming application. I'm using the example shown in the Malhar dedup example: https://github.com/apache/apex-malhar/blob/master/examples/dedup/src/main/java/org/apache/apex/examples/dedup/Application.java I made a few modification

Re: KafkaSinglePortExactlyOnceOutputOperator throwing exception about violation of exactly once

2018-03-15 Thread Vivek Bhide
Thanks Sandesh.. It helped. Regards Vivek

Re: KafkaSinglePortExactlyOnceOutputOperator throwing exception about violation of exactly once

2018-03-14 Thread Vivek Bhide
Exception stack trace is below 2018-03-13 17:00:53,219 INFO com.datatorrent.stram.StreamingContainerManager: Container container_1520983362676_0002_01_07 buffer server: 80e65028f6a8:60970 2018-03-13 17:00:54,811 INFO com.datatorrent.stram.StreamingContainerParent: child msg: Stopped running du

KafkaSinglePortExactlyOnceOutputOperator throwing exception about violation of exactly once

2018-03-14 Thread Vivek Bhide
Hi, I was testing the KafkaSinglePortExactlyOnceOutputOperator, and I see the error below in case the operator fails. This is not reproducible every time, but it occurs more than 50% of the time. I tried testing it with a single operator instance, with multiple partitions, and with parallel partitions, but I couldn'

Re: How to specify more than one field as a dedup keyExpression

2017-10-26 Thread Vivek Bhide
Thanks Ram.. so are you saying that with more than one integer field in the dedup key it will calculate the sum of the two fields, whereas for strings it will concatenate them (because of + overloading)? Regards Vivek

Re: How to specify more than one field as a dedup keyExpression

2017-10-25 Thread Vivek Bhide
Thank you everyone for the replies. The solution that Chinmay suggested is working, but then I see one more discrepancy. After adding more than one field as a dedup key, my expectation was to have the dedup decision made on the combination of these 2 keys. I ran the test case multiple times with BoundedDedup

Re: How to specify more than one field as a dedup keyExpression

2017-10-23 Thread Vivek Bhide
Thanks Ram for your suggestions. The field types that I am trying are the basic primitive types. In fact, I was just playing around with the dedup example that is available in the Malhar git repo. I just added one more field, id1, with getter and setters to the 'TestEvent' class from the test case, and want to try dedup o

How to specify more than one field as a dedup keyExpression

2017-10-23 Thread Vivek Bhide
Hi, I want to specify more than one attribute of a tuple in the keyExpression for BoundedDedupOperator. I tried putting it in multiple ways, but it doesn't work. Does BoundedDedup or TimeDedup support multiple fields while deduping? If yes, how do I specify them in properties.xml? Regards Vivek
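A later thread in this archive shows the expression form that was eventually used: `{$.id} + {$.name}`. A hedged properties.xml sketch of that configuration follows; the application and operator names here are placeholders, not from the original messages:

```xml
<!-- Hypothetical sketch: multi-field dedup key via an expression
     that combines two POJO fields. Names are placeholders. -->
<property>
  <name>dt.application.MyDedupApp.operator.dedupeOperator.prop.keyExpression</name>
  <value>{$.id} + {$.name}</value>
</property>
```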

Recommended way to query hive and capture output

2017-10-23 Thread Vivek Bhide
Hi, I would like to know if there is any recommended way to execute queries on Hive tables (not inserts), capture the query output, and process it? Regards Vivek

Re: Malhar input operator for connecting to Teradata

2017-08-30 Thread Vivek Bhide
Hi Thomas, I could get the poller operator working for Teradata with a few tweaks to your workaround and your latest code base from the PR. Thanks again. Regards Vivek

Re: Malhar input operator for connecting to Teradata

2017-08-28 Thread Vivek Bhide
Thanks Thomas for the workaround and the heads-up about the bugs :) Let me see if I can get this working in my use case. Regards Vivek

Malhar input operator for connecting to Teradata

2017-08-28 Thread Vivek Bhide
Hi, I would like to know if there is any input operator available to connect to and read data from Teradata? I tried using the JdbcPOJOPollInputOperator, but the queries it forms are not as per Teradata syntax and hence it fails. Regards Vivek

Re: How the application recovery works when it's started with -originalAppId

2017-08-14 Thread Vivek Bhide
Thanks a lot Pramod for going the extra mile and providing a solution on this issue. The explanation totally makes sense, and I will keep this in mind during any future development. Regards Vivek

Re: How the application recovery works when it's started with -originalAppId

2017-08-11 Thread Vivek Bhide
Hi Pramod and Thomas, Below are my findings so far on this issue: 1. The fix suggested by Pramod and the fix made as part of https://issues.apache.org/jira/browse/APEXMALHAR-2526 are doing the same thing 2. In the comments for https://issues.apache.org/jira/browse/APEXMALHAR-2526 I found that the new c

Re: How the application recovery works when it's started with -originalAppId

2017-08-11 Thread Vivek Bhide
It is present in the class, just after the setCapacity method.

Re: How the application recovery works when it's started with -originalAppId

2017-08-11 Thread Vivek Bhide
Please refer to the LRUCache class from the code base I pasted above. It's exactly what I'm using. Regards Vivek

Re: How the application recovery works when it's started with -originalAppId

2017-08-10 Thread Vivek Bhide
Thank you Pramod and Thomas for all your inputs. Hi Pramod, the JIRA https://issues.apache.org/jira/browse/APEXMALHAR-2526 that Thomas referred to seems to be the one in line with what you suggested as a possible solution. I see there is a new class, KryoJavaSerializer.java (new in Malhar and not present wit

Re: How the application recovery works when it's started with -originalAppId

2017-08-10 Thread Vivek Bhide
Thanks Pramod.. This seems to have done the trick. I will check again when I have some data to process, to see if that goes well with it; I am quite confident that it will. Just curious: is this the best way to handle this issue, or is there a more elegant way it can be addressed? Regards Vivek

Re: How the application recovery works when it's started with -originalAppId

2017-08-10 Thread Vivek Bhide
I have already pasted the stack trace in my original post. Also, can you please confirm what classpath Kryo is using vs. the default Java serializer, and where exactly it is set for Kryo? Regards Vivek

Re: How the application recovery works when it's started with -originalAppId

2017-08-10 Thread Vivek Bhide
Hi Pramod, As I said, we have an LRUCache (LinkedHashMap of ) which needs to be serialized, and it is initialized in the operator constructor. What we found is that, when the operator is serialized for checkpointing, the content of this LRUCache is not getting serialized; instead it's just an empty LinkedHas

Re: How the application recovery works when it's started with -originalAppId

2017-08-10 Thread Vivek Bhide
Hi Pramod, I get this error even when I try to resubmit the exact same apa. Is there any other angle of this problem that I should look at? Regards Vivek

How the application recovery works when it's started with -originalAppId

2017-08-10 Thread Vivek Bhide
Hi All, I have implemented an LRUCache in one of the operators, and this cache is not transient. This LRUCache is a simple extension of LinkedHashMap with its removeEldestEntry method overridden. What we found is that the default Kryo serializer, which Apex uses for checkpointing, doesn't work properly
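For readers of this archive, the cache described in this thread can be sketched as follows. This is a hypothetical reconstruction from the description (a LinkedHashMap subclass overriding removeEldestEntry), not the poster's actual class; the capacity handling and access-order choice are assumptions:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU cache sketch: a LinkedHashMap that evicts its
// least-recently-used entry once it grows past a fixed capacity.
class LRUCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LRUCache(int capacity) {
        // accessOrder = true: iteration order runs from least to most
        // recently accessed, so the eldest entry is the LRU one
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // returning true tells LinkedHashMap to drop the eldest entry
        return size() > capacity;
    }
}
```

Note that, as discussed later in the thread, such a non-transient field is checkpointed by the platform's serializer, so how Kryo handles the subclass matters.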

Re: data missing in AbstractFileOutPutOperator

2017-08-10 Thread Vivek Bhide
Hi Chiru, Have you tried waiting until your output file in HDFS rolls over (from .tmp to .0)? I have observed in our case that if you query the .tmp file, it may not show all the records written to it. The reason could be that the file is still eligible to be written to, and the output operato

Re: Apache Apex with Apache slider

2017-08-07 Thread Vivek Bhide
Thank you all for your responses. With all your inputs and a sample implementation of Slider within my organization, I was able to achieve Apex streaming app monitoring and automatic restart through Slider. Regards Vivek

Re: Apache Apex with Apache slider

2017-08-01 Thread Vivek Bhide
Yes Thomas, that's exactly what we are looking for. But using Slider is a bit trickier, since Slider manages the deployment of the application you give it. Also, as far as I have read, you need to have your whole project structure match Slider's expectations so that it can manage it by moving the code

Re: Apache Apex with Apache slider

2017-08-01 Thread Vivek Bhide
Hi Pramod, The main reason we are looking at marrying Apex with Slider is to make use of Slider's ability to monitor and restart an application in case of failure. We have seen a couple of times that an Apex streaming application got killed because of some sporadic issues with the cluster, and there was

Apache Apex with Apache slider

2017-08-01 Thread Vivek Bhide
Hi All, Are there any resources for reference on using Apache Apex with Apache Slider? Please let me know. Regards Vivek

Re: How to address unclean undeploy exception

2017-07-06 Thread Vivek Bhide
Below is the stack trace. Also, can you please point me to some sample examples of operators which use managed state, or general guidelines on how to use it? Regards Vivek 2017-07-06 18:02:05,006 ERROR engine.StreamingContainer (StreamingContainer.java:run(1456)) - Operator set [OperatorDeploy

How to address unclean undeploy exception

2017-07-05 Thread Vivek Bhide
Hi, In one of the operators, I have some big LRUCache objects (which are not transient and hence checkpointed), and when that operator restarts for any reason, I see the 'unclean undeploy' exception in the container logs. Unfortunately I don't have the stack trace with me, but is there any configuratio

How to set affinity between default unifier and its downstream operator

2017-06-29 Thread Vivek Bhide
Hi, My application is connecting to a Kafka topic with 2 partitions, and there are corresponding KafkaSinglePortInputOperator instances. Since there is more than one instance, as per Apex functionality, the application has a default unifier as the next downstream operator when deployed. The problem is that the unifier o
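For reference, Apex lets stream locality be requested in the configuration file, which is one lever for co-locating operators on a stream. Whether locality alone addresses the unifier-placement question asked here is an assumption; the application and stream names below are placeholders:

```xml
<!-- Hypothetical sketch: request container-local deployment for the
     stream between the Kafka input (with its unifier) and the next
     downstream operator. Names are placeholders. -->
<property>
  <name>dt.application.MyApp.stream.kafkaToNext.locality</name>
  <value>CONTAINER_LOCAL</value>
</property>
```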

Re: BoundedDedupOperator failing with java.lang.IllegalArgumentException: bucket conflict

2017-06-18 Thread Vivek Bhide
When can we expect 3.8.0 to be published to Maven? The latest Malhar version in Maven is still 3.7.0. Regards Vivek

What is the recommended way to achieve exactly-once tuple processing in an operator failure scenario

2017-06-17 Thread Vivek Bhide
After any of the operators fails during processing, it always recovers from the last checkpointed state, so it will reprocess all the tuples which were processed before the failure but not checkpointed. What is the recommended way to avoid this from happening? Is there any setting in Apex that enables
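As background for this question: Apex exposes a per-operator PROCESSING_MODE attribute (at-least-once by default, with at-most-once and exactly-once variants), which can be set in the configuration. Whether it suits the operators in this application is not established in the thread; the names below are placeholders:

```xml
<!-- Hypothetical sketch: request exactly-once processing mode for one
     operator. Application/operator names are placeholders. -->
<property>
  <name>dt.application.MyApp.operator.myOperator.attr.PROCESSING_MODE</name>
  <value>EXACTLY_ONCE</value>
</property>
```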

Re: BoundedDedupOperator failing with java.lang.IllegalArgumentException: bucket conflict

2017-06-12 Thread Vivek Bhide
Sure, will check and let you know. Regards Vivek

Re: Same application with different name

2017-06-12 Thread Vivek Bhide
That works like a charm!! Thank you. Regards Vivek

Re: Same application with different name

2017-06-12 Thread Vivek Bhide
Hi Sandesh, I tried the commands below: launch -apconf app-properties.xml -D DAGContext.APPLICATION_NAME="" launch -apconf app-properties.xml -D APPLICATION_NAME="" The Apex version I am running is 3.6.0. Regards Vivek

Same application with different name

2017-06-11 Thread Vivek Bhide
We have a requirement where we might need to submit another instance of the same Apex application with a different configuration in the same environment. Since Apex doesn't allow concurrent executions of the same application, is it possible to submit the same application with a different name at runtim

Re: BoundedDedupOperator failing with java.lang.IllegalArgumentException: bucket conflict

2017-06-09 Thread Vivek Bhide
Hi Bhupesh, I even tried using the TimeBoundedDedupe instead of BoundedDedup, and even that one fails with an exception. In this case, the container starts properly, but as soon as it tries to process the tuples it fails. Below are the configurations dt.application.DataUsageIngest.o

Re: BoundedDedupOperator failing with java.lang.IllegalArgumentException: bucket conflict

2017-06-09 Thread Vivek Bhide
Hi Bhupesh, The exception occurred immediately. Rather, the operator didn't even initialize completely and failed before that, causing all further operators to be stuck in the pending-deploy state. Regarding the properties file, as I said, there was no configuration with numBuckets in the config file, and only config f

BoundedDedupOperator failing with java.lang.IllegalArgumentException: bucket conflict

2017-06-08 Thread Vivek Bhide
Hi, I am using the BoundedDedupOperator, and with the default value of numBuckets (46340) the container is failing with the bucket conflict exception below 2017-06-08 17:52:10,140 INFO stram.StreamingContainerParent (StreamingContainerParent.java:log(170)) - child msg: Stopped running due to an exception

Container launch failing with ExitCodeException exitCode=1

2017-06-08 Thread Vivek Bhide
Hi All, There is a random container failure happening with the error below. The application was running fine, and it started failing only after adding the BoundedDedupOperator and a user-defined checksum operator that calculates the checksum of the message received from Kafka and sends it to the dedupe port

Re: AvroToPojo Operator doesn't recover after failure and keeps throwing Kryo exception

2017-05-31 Thread Vivek Bhide
Hi Sandesh, this worked as expected. Regards Vivek

Re: AvroToPojo Operator doesn't recover after failure and keeps throwing Kryo exception

2017-05-30 Thread Vivek Bhide
Thanks a lot Sandesh.. This problem was bugging me for quite some time. I will try these to see if they resolve the problem.

Re: Exceeded allowed memory block for AvroFileInputOperator

2017-05-30 Thread Vivek Bhide
Thank you all.. I will try to implement the suggestions. Regards Vivek

AvroToPojo Operator doesn't recover after failure and keeps throwing Kryo exception

2017-05-30 Thread Vivek Bhide
I am using the AvroToPojo Malhar operator in conjunction with AvroFileInputOperator for converting Avro records to POJOs. While testing the application's stability, I found that the AvroToPojo operator doesn't recover in case of failure and keeps throwing the exception below. This in turn mak

Re: Exceeded allowed memory block for AvroFileInputOperator

2017-05-30 Thread Vivek Bhide
Thank you.. I understand that partitioning is one of the options, but is there a way to decrease the emit rate? The reason I am interested in this option is that the upstream operator is a file reader operator and the downstream operator is a filter operator. I tried partitioning the filter operator, but still I see

Exceeded allowed memory block for AvroFileInputOperator

2017-05-26 Thread Vivek Bhide
I have an AvroFileInputOperator which has a large number of files to be read when the application starts. While the file reading goes well at the start, I start getting the warning below in the log after reading around 50+ files. Is there something to be worried about, and is there a way to mitigate it? 2017-05-2

Re: NullPointerException at AbstractFSRollingOutputOperator while using HiveOutputModule

2017-05-24 Thread Vivek Bhide
Thank you Sanjay for your reply. I am using Malhar 3.7.0 but am still facing the issue. I will reevaluate and recreate the issue with all the logs to support my findings before I report any of the issues. Regards Vivek

Re: NullPointerException at AbstractFSRollingOutputOperator while using HiveOutputModule

2017-05-20 Thread Vivek Bhide
Hi Sanjay, After working on it for some more time, I could find a pattern in how and when the code breaks, but in any situation it doesn't work for sure. Below are my observations so far: 1. Regarding setting a parameter, as you said, the app_name is optional, and the reason is you don't expect to have more t

Re: NullPointerException at AbstractFSRollingOutputOperator while using HiveOutputModule

2017-05-17 Thread Vivek Bhide
Hi Sanjay, I did all the required changes, but the application is still throwing the same error. I increased the value to 1 (default 100) and even removed the setting of the upstream operator's streaming window customization dt.application..operator.fsRolling.prop.maxWindowsWithNoDat

Re: NullPointerException at AbstractFSRollingOutputOperator while using HiveOutputModule

2017-05-17 Thread Vivek Bhide
Thank you Sanjay. I did go through the code and found the value you mentioned, but was not sure if I should override it. Regarding no data being sent to the HiveOutputModule for a number of windows, it could be the case, since the upstream operator to the Hive module has the streaming window interv

Re: HiveOutputModule creating extra directories, than specified, while saving data into HDFS

2017-05-17 Thread Vivek Bhide
Hi Sanjay, I waited for the application to roll over to the next file, but as soon as the file reached the size I defined, the operator started failing with the error below. Any suggestion on this error? FYI, I overrode the file size in my application properties to 50 MB from the default 128 MB. 2017-05-17 20: