Hi,
I have created a new operator which converts Avro messages to JSON messages by
extending BaseOperator. I'm unable to use AvroToPojoOperator due to the
complexity of the Avro structure. I'm trying to analyze its performance.
When I use:
VCORES MEMORY Tuples Emitted
Hi Vlad,
I have filed a JIRA https://issues.apache.org/jira/browse/APEXMALHAR-2557
but I am not sure about all the attributes, like component, epic, or label.
Please update the ticket with the correct values, or let me know and I will
add them.
Regards
Vivek
--
Sent from: http://apache-apex-users-l
Hi Bhupesh,
Please find the details below
Dedup operator configuration :
dt.application.KafkaDedupOutputOperatorTest.operator.dedupeOperator.prop.keyExpression
{$.id} + {$.name}
dt.application.KafkaDedupOutputOperatorTest.opera
Hi Vlad,
Yes, it's 3.8.0
Regards
Vivek
Hi All,
While using the TimeBasedDedupOperator for deduping, I see that the operator
keeps failing with the NullPointerException below. I don't know if it has
anything to do with the configuration.
I also see that the operator is always high on CPU usage, almost reaching 100%.
I tried increasing vcores and als
Thanks, Sandesh, for the confirmation. Can you point us to the updated version
of this output operator?
Regards
Vivek
Can someone please confirm our findings? It is very critical for us to solve
this issue, since the whole pipeline's functionality is at stake.
Regards
Vivek
Hi Pramod,
We did some more research by adding more logging to the KafkaInput operator,
and below are our findings.
Application Setup:
1. WindowDataManager for KafkaInputOperator is set to FSWindowDataManager
2. The streaming window for the application is set to 5 seconds (from 0.5 seconds) for
easily reprodu
Thanks Bhupesh.
Regards
Vivek
Hi Pramod,
I tried it but I am not getting consistent results. It worked a few times but
then failed again with the same error. Is there anything else you would
recommend to validate?
Regards
Vivek
This is a continuation of my previous question about
KafkaSinglePortExactlyOnceOutputOperator. I am trying to achieve end-to-end
exactly-once processing, with data received from one Kafka topic and
finally posted to another Kafka topic. Below are the three things needed, as
Pramod mentioned in on
Hi,
I'm trying to understand the working of TimeBasedDedupOperator for my streaming
application. I'm using the example shown in the Malhar dedup example:
https://github.com/apache/apex-malhar/blob/master/examples/dedup/src/main/java/org/apache/apex/examples/dedup/Application.java
I made a few modification
Thanks, Sandesh. It helped.
Regards
Vivek
Exception stack trace is below
2018-03-13 17:00:53,219 INFO
com.datatorrent.stram.StreamingContainerManager: Container
container_1520983362676_0002_01_07 buffer server: 80e65028f6a8:60970
2018-03-13 17:00:54,811 INFO com.datatorrent.stram.StreamingContainerParent:
child msg: Stopped running du
Hi,
I was testing the KafkaSinglePortExactlyOnceOutputOperator and I see the
error below when the operator fails. This is not reproducible every time but
occurs more than 50% of the time. I tried testing it with a single operator
instance, with multiple partitions, and with parallel partitions, but I couldn'
Thanks, Ram. So are you saying that more than one integer field in the dedup
key will result in the sum of the two fields, whereas for strings it will
concatenate them (because of + overloading)?
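For reference, if the keyExpression is compiled down to a plain Java expression (which the `{$.id} + {$.name}` syntax elsewhere in this thread suggests, though I have not verified the evaluator's internals), then `+` follows ordinary Java semantics; a minimal JDK-only sketch:

```java
public class PlusOverloadDemo {
    public static void main(String[] args) {
        // For numeric operands, '+' is arithmetic addition
        int id = 7, id1 = 3;
        System.out.println(id + id1);      // prints 10

        // For String operands, '+' is concatenation
        String name = "foo", name1 = "bar";
        System.out.println(name + name1);  // prints foobar

        // Mixing a String with anything also concatenates
        System.out.println(id + name);     // prints 7foo
    }
}
```

So two integer keys would collapse into their sum (7 and 3 dedupe the same as 4 and 6), while two string keys concatenate, which is why string fields can still appear to "work" as a composite key.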
Regards
Vivek
Thank you, everyone, for the replies. The solution Chinmay suggested is working,
but I see one more discrepancy.
After adding more than one field as a dedup key, my expectation was to have
the dedup decision made on the combination of these two keys. I ran the test
case multiple times with BoundedDedup
Thanks, Ram, for your suggestions.
The field types I am trying are the basic primitive types. In fact, I was
just playing around with the dedup example that is available in the Malhar git repo.
I just added one more field, id1, with getters and setters to the 'TestEvent' class
from the test case and want to try dedup o
Hi,
I want to specify more than one attribute of a tuple in the keyExpression for
BoundedDedupOperator. I tried specifying it in multiple ways but it doesn't
work.
Does BoundedDedup or TimeDedup support multiple fields while deduping? If
yes, how do I specify them in properties.xml?
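One way a composite key is sometimes expressed, sketched below with placeholder application and operator names (the exact expression syntax should be verified against your Malhar version; the `{$.field}` form is taken from the configuration shown earlier in this thread):

```xml
<!-- Hypothetical sketch for properties.xml: "MyApp" and "dedupeOperator"
     are placeholders; the separator guards against ambiguous concatenation -->
<property>
  <name>dt.application.MyApp.operator.dedupeOperator.prop.keyExpression</name>
  <value>{$.id} + "_" + {$.name}</value>
</property>
```

Inserting a literal separator between the fields avoids collisions such as "ab" + "c" matching "a" + "bc".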
Regards
Vivek
Hi
I would like to know if there is any recommended way to execute queries on
Hive tables (not inserts), capture the query output, and process it.
Regards
Vivek
Hi Thomas,
I could get the poller operator working for Teradata with a few tweaks to
your workaround and your latest code base from the PR.
Thanks again
Regards
Vivek
Thanks, Thomas, for the workaround and the heads-up about the bugs :)
Let me see if I can get this working in my use case.
Regards
Vivek
--
View this message in context:
http://apache-apex-users-list.78494.x6.nabble.com/Malhar-input-operator-for-connecting-to-Teradata-tp1843p1845.html
Sent from the Apac
Hi,
I would like to know if there is any input operator available to connect to and
read data from Teradata. I tried using the JdbcPOJOPollInputOperator, but the
queries it forms are not as per Teradata syntax, and hence it fails.
Regards
Vivek
Thanks a lot, Pramod, for going the extra mile and providing a solution to this
issue. The explanation totally makes sense, and I will keep it in mind
during any future development.
Regards
Vivek
Hi Pramod and Thomas
Below are my findings so far on this issue:
1. The fix suggested by Pramod and the fix made as part of
https://issues.apache.org/jira/browse/APEXMALHAR-2526 are doing the same
thing.
2. In the comments for https://issues.apache.org/jira/browse/APEXMALHAR-2526
I found that the new c
It is present in the class, just after the setCapacity method.
Please refer to the LRUCache class from the code base I pasted above. It's
exactly what I'm using.
Regards
Vivek
Thank You Pramod and Thomas for all your inputs.
Hi Pramod,
The JIRA https://issues.apache.org/jira/browse/APEXMALHAR-2526 that Thomas
referred to seems to be the one in line with what you suggested as a possible
solution. I see there is a new class, KryoJavaSerializer.java (new in Malhar
and not present wit
Thanks, Pramod. This seems to have done the trick. I will check again when I
have some data to process to see if that goes well with it. I am quite
confident that it will.
Just curious: is this the best way to handle this issue, or is there another,
more elegant way it can be addressed?
Regards
Vivek
I have already pasted the stack trace in my original post. Also, can you
please confirm what classpath Kryo is using vs. the default Java serializer,
and where exactly it is set for Kryo?
Regards
Vivek
Hi Pramod,
As I said, we have an LRUCache (LinkedHashMap of ) which needs to
be serialized, and it is initialized in the operator constructor. What we found
is that, when the operator is serialized for checkpointing, the content of this
LRUCache is not getting serialized; instead it's just an empty
LinkedHas
Hi Pramod,
I get this error even when I try to resubmit the exact same app. Is there any
other angle of this problem that I should look at?
Regards
Vivek
Hi All,
I have implemented an LRUCache in one of the operators, and this cache is
not transient. This LRUCache is a simple extension of LinkedHashMap with its
removeEldestEntry method overridden. What we found is that the default Kryo
serializer, which Apex uses for checkpointing, doesn't work properly
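For readers following along, the cache described above can be sketched with the JDK alone; the class name matches the thread, but the capacity and field names here are illustrative:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch of the LRU cache described above: a LinkedHashMap in
// access order that evicts the eldest entry once capacity is exceeded.
public class LRUCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LRUCache(int capacity) {
        // accessOrder = true makes iteration order the LRU order
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

Note that all of the map's internal state is established by the constructor chain; a serializer that bypasses constructors on deserialization (as Kryo can, depending on its instantiation strategy) may plausibly restore such an object as an empty map, which is consistent with the symptom reported in this thread.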
Hi Chiru,
Have you tried waiting until your output file in HDFS rolls over
(from .tmp to .0)? I have observed in our case that if you query a .tmp
file, it may not show all the records written to it. The reason could be
that the file is still eligible to be written to and the output operato
Thank you all for your responses. With all your inputs and a sample
implementation of Slider within my organization, I was able to achieve
Apex streaming app monitoring and automatic restart through Slider.
Regards
Vivek
Yes, Thomas, that's exactly what we are looking for. But using Slider is a bit
trickier, since Slider manages the deployment of the application you give it.
Also, as far as I have read, you need to have your whole project
structured as Slider expects, so that it can manage it by moving the code
Hi Pramod,
The main reason we are looking at marrying Apex with Slider is to make use of
Slider's ability to monitor and restart an application in case of
failure. We have seen a couple of times that an Apex streaming application got
killed because of some sporadic issues with the cluster, and there was
Hi All,
Are there any resources for reference on implementing Apache Apex with Apache
Slider? Please let me know.
Regards
Vivek
Below is the stack trace. Also, can you please point me to some sample examples
of operators that use managed state, or general guidelines on how to
use it?
Regards
Vivek
2017-07-06 18:02:05,006 ERROR engine.StreamingContainer
(StreamingContainer.java:run(1456)) - Operator set
[OperatorDeploy
Hi,
In one of the operators, I have some big LRUCache objects (which are not
transient and hence checkpointed), and when that operator restarts for any
reason, I see the 'unclean undeploy' exception in the container logs.
Unfortunately I don't have the stack trace with me, but is there any
configuratio
Hi
My application is connecting to a Kafka topic with 2 partitions, and there are
corresponding KafkaSinglePortInputOperator instances. Since there is more
than one instance, as per Apex functionality, the application has a default
unifier as the next downstream operator when deployed. The problem is that the
unifier o
When can we expect 3.8.0 to be published to Maven? The latest Malhar version in
Maven is still 3.7.0.
Regards
Vivek
After any of the operators fails during processing, it always recovers from
the last checkpointed state, so it will reprocess all the tuples which were
processed before the failure but not checkpointed. What is the recommended way
to avoid this from happening? Is there any setting in Apex that enables
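For context, Apex exposes a per-operator processing-mode attribute; a sketch with a placeholder operator name is below (verify the attribute and values against your Apex version). Note that checkpoint replay is by design, and true end-to-end exactly-once generally also requires an idempotent or transactional output operator, such as the KafkaSinglePortExactlyOnceOutputOperator discussed elsewhere in this thread.

```xml
<!-- Hypothetical sketch for properties.xml: "myOutputOperator" is a placeholder -->
<property>
  <name>dt.operator.myOutputOperator.attr.PROCESSING_MODE</name>
  <value>EXACTLY_ONCE</value>
</property>
```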
Sure. Will check and let you know
Regards
Vivek
That works like a charm !! Thank you
Regards
Vivek
Hi Sandesh,
I tried the commands below:
launch -apconf app-properties.xml -D DAGContext.APPLICATION_NAME=""
launch -apconf app-properties.xml -D APPLICATION_NAME=""
The Apex version I am running is 3.6.0.
Regards
Vivek
We have a requirement where we might need to submit another instance of the
same Apex application with a different configuration in the same environment.
Since Apex doesn't allow concurrent executions of the same application, is
it possible to submit the same application with a different name at
runtim
Hi Bhupesh,
I even tried using TimeBoundedDedupe instead of BoundedDedup, and even
that one fails with an exception. In this case, the container starts properly,
but as soon as it tries to process the tuples it fails.
Below are the configurations
dt.application.DataUsageIngest.o
Hi Bhupesh,
The exception occurred immediately. Rather, the operator didn't even initialize
completely and failed before that, causing all further operators to be stuck in
the pending-deploy state. Regarding the properties file, as I said, there was no
configuration with numBuckets in the config file, and only the config f
Hi,
I am using the BoundedDedupOperator, and with the default value of numBuckets
(46340) the container is failing with the bucket conflict exception below
2017-06-08 17:52:10,140 INFO stram.StreamingContainerParent
(StreamingContainerParent.java:log(170)) - child msg: Stopped running due to
an exception
Hi All,
There is a random container failure happening with the error below. The
application was running fine, and it started failing only after adding the
BoundedDedupOperator and a user-defined checksum operator that calculates the
checksum of the message received from Kafka and sends it to the dedupe port
Hi Sandesh, this worked as expected.
Regards
Vivek
Thanks a lot, Sandesh. This problem had been bugging us for quite some time. I
will try these to see if they resolve the problem.
Thank you all. I will try to implement the suggestions.
Regards
Vivek
I am using the AvroToPojo Malhar operator in conjunction with
AvroFileInputOperator for converting Avro records to POJOs. While testing
the application's stability, I found that the AvroToPojo operator
doesn't recover in case of failure and keeps throwing the exception below. This
in turn mak
Thank you. I understand that partitioning is one of the options, but is there
a way to decrease the emit rate? The reason I am interested in this option is
that the upstream operator is a file reader operator and the downstream
operator is a filter operator. I tried partitioning the filter operator, but I
still see
I have an AvroFileInputOperator which has a large number of files to read when
the application starts. While the file reading goes well at the start, I
start getting the warning below in the log after reading around 50+ files.
Is there something to be worried about, and is there a way to mitigate it?
2017-05-2
Thank you, Sanjay, for your reply. I am using Malhar 3.7.0 but am still facing
the issue. I will reevaluate and recreate the issue with all the logs to support
my findings before I report any of the issues.
Regards
Vivek
Hi Sanjay,
After working on it for some more time, I could find a pattern of how and when
the code breaks, but it certainly doesn't work in any situation. Below are my
observations so far:
1. Regarding setting a parameter, as you said, the app_name is optional, and
the reason is you don't expect to have more t
Hi Sanjay,
I made all the required changes, but the application is still throwing the same
error. I increased the value to 1 (default 100) and even removed the
upstream operator's streaming window customization:
dt.application..operator.fsRolling.prop.maxWindowsWithNoDat
Thank you, Sanjay. I did go through the code and found the value you mentioned,
but I was not sure if I should override it.
Regarding no data being sent to the HiveOutputModule for a number of
windows, that could be the case, since the upstream operator to the Hive module
has the streaming window interv
Hi Sanjay
I waited for the application to roll over to the next file, but as soon as the
file reached the size I defined, the operator started failing with the error
below.
Any suggestions on this error? FYI, I overrode the file size in my
application properties to 50 MB from the default 128 MB.
2017-05-17 20:
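For reference, the 50 MB file-size override mentioned above would typically be the file output operator's maxLength property, as sketched below; the application name is a placeholder, while fsRolling is the operator name used earlier in this thread, and the exact property name should be verified against your Malhar version:

```xml
<!-- Hypothetical sketch for properties.xml: "MyApp" is a placeholder;
     50 MB expressed in bytes -->
<property>
  <name>dt.application.MyApp.operator.fsRolling.prop.maxLength</name>
  <value>52428800</value>
</property>
```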