Re: Kryo version and default serializer

2018-06-11 Thread Pramod Immaneni
Aaron, The question might be best served on the d...@apex.apache.org mailing list, as you are starting to make changes to the sources. Nothing obvious jumps out to me; you don't need Hadoop on your system to build the sources successfully with all the tests passing. The test failures you are

Re: Need help on achieving end to end exactly once with KafkaIn and KafkaOut

2018-03-21 Thread Pramod Immaneni
Any window that was not complete by the time the operator died is not replayed, by definition (as we don't have all the data in the window), and the output operators should also not expect that. In your case, if the operator died during window ..24, then on restart you can expect that the input

Re: Need help on achieving end to end exactly once with KafkaIn and KafkaOut

2018-03-15 Thread Pramod Immaneni
Have you tried instantiating FSWindowDataManager and setting it on the operator instance in populateDAG? There may be state associated with the window manager which will be lost if you set a new instance each time in setup. You should be able to use KafkaSinglePortInputOperator directly. On Thu,

Re: How the application recovery works when its started with -originalAppId

2017-08-11 Thread Pramod Immaneni
My bad, let me see.. On Fri, Aug 11, 2017 at 11:13 AM, Vivek Bhide wrote: > It is present in the class, just after the setCapacity method > > -- > View this message in context: http://apache-apex-users-list.78494.x6.nabble.com/How-the-application-recovery-works-

Re: How the application recovery works when its started with -originalAppId

2017-08-11 Thread Pramod Immaneni
Provide an empty constructor as well. On Fri, Aug 11, 2017 at 10:39 AM, Vivek Bhide wrote: > Please refer to the LRUCache class from the code base I pasted above. It's > exactly what I'm using > > Regards > Vivek > > -- > View this message in context:
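The advice above can be sketched as a minimal sketch of the kind of LRU cache class being discussed (the actual class from the thread is not shown, so the names, default capacity, and `setCapacity` method here are assumptions): Kryo and similar serializers instantiate classes reflectively during checkpoint recovery, so a `LinkedHashMap` subclass needs a no-arg constructor in addition to any capacity-taking one.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of an LRU cache built on LinkedHashMap. The no-arg
// constructor is what the advice above asks for: serializers create the
// instance reflectively during deserialization, then populate its fields.
public class LRUCache<K, V> extends LinkedHashMap<K, V> {
    private int capacity = 100; // assumed default; the real class may differ

    public LRUCache() {
        // required for reflective instantiation during deserialization
        super(16, 0.75f, true);
    }

    public LRUCache(int capacity) {
        super(capacity, 0.75f, true); // access-order gives LRU semantics
        this.capacity = capacity;
    }

    public void setCapacity(int capacity) {
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // evict the least-recently-used entry once capacity is exceeded
        return size() > capacity;
    }
}
```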

Re: How the application recovery works when its started with -originalAppId

2017-08-11 Thread Pramod Immaneni
Can you share the code for your class that extends the linked hash map? Thanks. On Thu, Aug 10, 2017 at 11:05 PM Vivek Bhide wrote: > Thank You Pramod and Thomas for all your inputs. > > Hi Pramod, > Jira https://issues.apache.org/jira/browse/APEXMALHAR-2526 that Thomas

Re: How the application recovery works when its started with -originalAppId

2017-08-10 Thread Pramod Immaneni
Vivek, Also a slightly more portable modification to what I suggested earlier is to use kryo.getClassLoader() instead of Thread.currentThread().getContextClassLoader() in the JavaSerializer. Thanks On Thu, Aug 10, 2017 at 8:25 PM, Pramod Immaneni <pra...@datatorrent.com> wrote: >

Re: How the application recovery works when its started with -originalAppId

2017-08-10 Thread Pramod Immaneni
XCORE-767 > > Perhaps the fix for the first item is what you need? > > Thomas > > On Thu, Aug 10, 2017 at 8:41 PM, Pramod Immaneni <pra...@datatorrent.com> wrote: >> I would dig deeper into why serde of the linked hashmap is failing. There >> are addi

Re: How the application recovery works when its started with -originalAppId

2017-08-10 Thread Pramod Immaneni
I would dig deeper into why serde of the linked hashmap is failing. There is additional logging you can enable in Kryo to get more insight. You can even try a standalone Kryo test to see if it is a problem with the LinkedHashMap itself or because of some other object that was added to it. You

Re: How the application recovery works when its started with -originalAppId

2017-08-10 Thread Pramod Immaneni
It will try to deserialize the old state (from prior application checkpoints) with your new jars from the apa that you are trying to launch. So if there are structural incompatibilities the deserialization will fail. On Thu, Aug 10, 2017 at 11:41 AM, Vivek Bhide wrote: > Hi All, >

Re: Apache Apex with Apache slider

2017-08-01 Thread Pramod Immaneni
You might be able to launch the application by running the apex cli as a separate process from your slider code and passing the apa. The apa could be in hdfs. This would however require apex cli to be present on all nodes as your slider code could be running on any node on the cluster. Thanks On

Re: Apache Apex with Apache slider

2017-08-01 Thread Pramod Immaneni
Could you elaborate on your use case and why you would need to use Slider when Apex is already a native YARN application? Thanks. On Tue, Aug 1, 2017 at 10:08 AM, Vivek Bhide wrote: > Hi All, > > Are there any resources for reference on implementing Apache Apex

Re: [EXTERNAL] Re: hdfs file write operator is increasing the latency - resulting entire DAG to fail

2017-07-13 Thread Pramod Immaneni
operator to 20Gb, which I feel is very huge. > <property> > <name>dt.operator.HDFS_operator.attr.MEMORY_MB</name> > <value>20480</value> > </property> > > Please advise. > > Thanks a lot. > > Raja.

Re: hdfs file write operator is increasing the latency - resulting entire DAG to fail

2017-07-13 Thread Pramod Immaneni
Hi Raja, How many partitions do you have for the file output operator, and what would you say your data write rate is in bytes/second? Thanks. On Thu, Jul 13, 2017 at 8:13 AM, Raja.Aravapalli wrote: > Team, > > We have an apex application that is reading from

Re: Kafka operators with Kerberos authentication

2017-07-07 Thread Pramod Immaneni
Wouldn't the kafka jaas.conf and keytab be already present on the nodes if managing the kafka deployment through the distro? Thanks On Thu, Jul 6, 2017 at 6:05 PM, Thomas Weise wrote: > Hi, > > Has anyone run the Apex Kafka consumer or producer with security enabled? > > I got

Re: I can't deploy my application package

2017-06-27 Thread Pramod Immaneni
Hi Guilherme, I am assuming you mean the datatorrent rts web interface. I am forwarding the message to the group that would be more appropriate to discuss this. Thanks On Tue, Jun 27, 2017 at 2:07 PM, Guilherme Hott wrote: > Hi guys, I am having this problem when I

Re: Is there a way to schedule an operator?

2017-06-13 Thread Pramod Immaneni
ge and start the > process. I think this would work but I would like to know if there is something more > appropriate. > On Tue, Jun 13, 2017 at 3:25 PM, Pramod Immaneni <pra...@datatorrent.com> > wrote: >> There is no built-in scheduler to schedule the DAGs at a prescribed time, >>

Re: Is there a way to schedule an operator?

2017-06-13 Thread Pramod Immaneni
There is no built-in scheduler to schedule the DAGs at a prescribed time; you would need to use some external mechanism. Because it is a daily one-time activity, would something like cron work for you? On Tue, Jun 13, 2017 at 3:22 PM, Guilherme Hott wrote: > Because I am
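The cron suggestion above might look like the following crontab entry. The installation path, user home, and package name are hypothetical; the general shape is invoking the apex CLI with a launch command for the application package.

```
# Illustrative crontab entry (paths and .apa name are placeholders):
# launch the packaged application every day at 2:00 AM via the apex CLI.
0 2 * * * /opt/datatorrent/current/bin/apex -e "launch /home/user/myapp.apa"
```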

Re: Managed State

2017-05-08 Thread Pramod Immaneni
Hey Ilya, I think for the second part of the question, you can look at the spillable data structures for examples of how to use managed state: org.apache.apex.malhar.lib.state.spillable.Spillable*. Regarding the first part, from what I know and remember, it is not thread safe, but I can't say with

Re: set attribute of type Map in properties.xml

2017-04-12 Thread Pramod Immaneni
Sorry, different braces: dt.operator.opname.prop.propname(key) value. On Wed, Apr 12, 2017 at 4:23 PM, Pramod Immaneni <pra...@datatorrent.com> wrote: > Hi Sunil, > > Have you tried > > dt.operator.opname.prop.propname[key] > value > > Thanks
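The corrected syntax above appears to have had its XML stripped by the archive; a hedged reconstruction of what the properties.xml fragment likely looked like is below. The operator name, property name, key, and value are all placeholders.

```xml
<!-- Hedged reconstruction: setting one entry of a Map-typed operator
     property by putting the map key in parentheses after the property
     name. All names here are placeholders. -->
<property>
  <name>dt.operator.opname.prop.propname(key)</name>
  <value>value</value>
</property>
```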

Re: File output operator - HDFS Write failing

2017-03-23 Thread Pramod Immaneni
Hi Raja, What version of Malhar are you using? Are you extending the AbstractFileOutputOperator or are you using a stock implementation from Malhar? Thanks. On Thu, Mar 23, 2017 at 4:57 AM, Raja.Aravapalli wrote: > > Hi Team, > > I have a file output operator,

Re: 3.5.0 apex core build failing with hadoop 2.7.3 dependency

2017-03-23 Thread Pramod Immaneni
So much for semantic versioning.. On Thu, Mar 23, 2017 at 7:15 AM, Munagala Ramanath wrote: > Looks like this was a breaking change introduced in Hadoop 2.7.0: > > In https://hadoop.apache.org/docs/r2.6.5/api/org/apache/hadoop/yarn/conf/ > YarnConfiguration.html we have: >

Re: Apex installation instructions

2017-03-23 Thread Pramod Immaneni
Hi Mohammad, It is being used in production in various places; at least from a DataTorrent perspective, we bundle it in DataTorrent RTS and have multiple customers using it in production. The reason you see com.datatorrent is that it was originally created by DataTorrent as proprietary software and

Re: Apex installation instructions

2017-03-22 Thread Pramod Immaneni
What is the error message you are seeing? On Wed, Mar 22, 2017 at 5:51 PM, Mohammad Kargar wrote: > That works in a Dev environment fine. However in a test cluster > environment (where Apex script copied manually) it's failing to find > configurations. > > On Mar 22, 2017

Re: DataTorrent Security/Authentication setup in cluster

2017-03-22 Thread Pramod Immaneni
Hi Srinivas, For datatorrent related questions, please email the dt-users forum https://groups.google.com/forum/#!forum/dt-users. You can still email apex related questions to this group. Thanks On Wed, Mar 22, 2017 at 10:53 AM, Teddy Rusli wrote: > Hi Srinivas, > > Is

Re: Shell script for launching and shutting down

2017-01-10 Thread Pramod Immaneni
It should be in your datatorrent installation, under the bin folder. If you have an older version you would see a command 'dtcli' instead. Thanks On Tue, Jan 10, 2017 at 12:10 PM, Sunil Parmar wrote: > Where is the executable ‘apex’; it doesn’t appear to be in my

Re: PARTITION_PARALLEL Vs Regular mode

2016-12-22 Thread Pramod Immaneni
Arvindan, When you had the MxN case with 100 Kafka consumers sending to 120 Parquet writers, what was the CPU utilization of the Parquet containers? Was it close to 100% or did you have spare cycles? I am trying to determine if it is an IO bottleneck or processing. Thanks. On Thu, Dec 22, 2016 at

Re: Connection refused exception

2016-11-29 Thread Pramod Immaneni
AppMasterService.java:1025) > at com.datatorrent.stram.StreamingAppMasterService.run(StreamingAppMasterService.java:647) > at com.datatorrent.stram.StreamingAppMaster.main(StreamingAppMaster.java:104) > > From: Pramod Immaneni

Re: Connection refused exception

2016-11-29 Thread Pramod Immaneni
Brandon, This is not related to kafkaInputOperator talking to Kafka. Do you see any other exceptions in the logs around this exception before or after? On Tue, Nov 29, 2016 at 6:54 AM, Feldkamp, Brandon (CONT) < brandon.feldk...@capitalone.com> wrote: > Hello, > > > > I’m wondering if anyone

Re: How to use DataTorrent

2016-11-03 Thread Pramod Immaneni
You can build it all in Java. On Thu, Nov 3, 2016 at 4:01 PM, dimple wrote: > Is it necessary to use dtAssemble, or I can get away with just Java? > > Thanks, > Dimple > > > > -- > View this message in context: http://apache-apex-users-list. >

Re: balanced of Stream Codec

2016-10-16 Thread Pramod Immaneni
Hi Sunil, Have you tried an alternate hashing function, other than the Java hashCode, that might provide a more uniform distribution of your data? The Google Guava library provides a set of hashing strategies, like murmur hash, that are reported to have fewer hash collisions in different cases. Below
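The idea above can be sketched with a self-contained partition helper. To keep the example dependency-free, the murmur3 32-bit finalizer step is written inline rather than calling Guava; this is an illustration of post-mixing a hash before taking the partition index, not Apex's actual StreamCodec API.

```java
// Sketch of partitioning a key across N partitions after running its
// hashCode through a murmur3-style finalizer ("avalanche") step, which
// spreads the input bits across the whole 32-bit word and tends to give
// a more uniform partition distribution than raw hashCode values.
public class PartitionHash {
    // murmur3 32-bit finalizer constants and shifts
    static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    static int partition(Object key, int numPartitions) {
        // mask off the sign bit so the modulo result is non-negative
        return (mix(key.hashCode()) & Integer.MAX_VALUE) % numPartitions;
    }
}
```

In an Apex StreamCodec, the same mixing step would be applied inside `getPartition(T tuple)` before the platform maps the value onto the physical partitions.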

Re: A proposal for Malhar

2016-08-09 Thread Pramod Immaneni
Malhar. > As stated before, the reason why we would like to retire operators in > Malhar is because some of them were written a long time ago

Re: Apache Apex and Dip blog

2016-08-01 Thread Pramod Immaneni
Hey Neeraj, Nice blog. It would be great to link to it in the apex website blog section on this page: https://apex.apache.org/docs.html. On Mon, Aug 1, 2016 at 2:09 PM, Neeraj Sabharwal wrote: >

Re: A proposal for Malhar

2016-07-12 Thread Pramod Immaneni
and 3 are not mutually exclusive. Any thoughts? > David > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni <pra...@datatorrent.com> > wrote: >> I wanted to close the loop on this discussion. In general everyone seemed >> to be favorable to this idea with

Re: how to increase lifetime of hdfs delegation tokens ?

2016-07-05 Thread Pramod Immaneni
30 > > dt.namenode.delegation.token.max-lifetime > 30 > > dt.attr.DEBUG > true > > For more information about the application auto-fetching new tokens, read here: > https://github.com/apache/apex-c

Re: how to increase lifetime of hdfs delegation tokens ?

2016-07-01 Thread Pramod Immaneni
users@apex.apache.org > Date: Monday, June 20, 2016 at 5:43 PM > To: users@apex.apache.org > Subject: Re: how to increase lifetime of hdfs delegation tokens ? > > Sure Pramod. Please respond o

Re: how to increase lifetime of hdfs delegation tokens ?

2016-06-20 Thread Pramod Immaneni
application. > Can I set these in the properties.xml file and will it still work? > > Regards, > Raja. > > From: Pramod Immaneni <pra...@datatorrent.com> > Reply-To: users@apex.apache.org > Date: Monday, June 20, 2016 at 4:32 P

Re: how to increase lifetime of hdfs delegation tokens ?

2016-06-20 Thread Pramod Immaneni
Hi Raja, Yes, the keytab would be copied over to HDFS and reused for getting a new token before the old one expires. By default it is 7 days. If it is different in your cluster, please set the properties dt.resourcemanager.delegation.token.max-lifetime and dt.namenode.delegation.token.max-lifetime
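A properties.xml fragment for the two properties named above might look like the following. The values are an assumption: they are written here in milliseconds (30 days) and should be replaced with whatever lifetime your cluster is actually configured with.

```xml
<!-- Illustrative fragment; values assume milliseconds (30 days shown)
     and must match the cluster's actual token max-lifetime settings. -->
<property>
  <name>dt.resourcemanager.delegation.token.max-lifetime</name>
  <value>2592000000</value>
</property>
<property>
  <name>dt.namenode.delegation.token.max-lifetime</name>
  <value>2592000000</value>
</property>
```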