Re: How the application recovery works when its started with -originalAppId

2017-08-10 Thread Pramod Immaneni
Vivek, Also a slightly more portable modification to what I suggested earlier is to use kryo.getClassLoader() instead of Thread.currentThread().getContextClassLoader() in the JavaSerializer. Thanks On Thu, Aug 10, 2017 at 8:25 PM, Pramod Immaneni wrote: > I believe
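
A minimal sketch of the idea in this message, using only the JDK rather than Kryo itself: Java deserialization resolves class names through whichever class loader it is handed, so passing the loader explicitly avoids depending on the thread context class loader. The names `LoaderAwareObjectInputStream` and `LoaderAwareDemo` are hypothetical and are not part of Kryo or Apex; in the setting discussed in the thread, the loader passed in would presumably be the one returned by kryo.getClassLoader().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: resolves classes through an explicitly supplied loader
// instead of whatever loader the default ObjectInputStream would pick.
class LoaderAwareObjectInputStream extends ObjectInputStream {
    private final ClassLoader loader;

    LoaderAwareObjectInputStream(InputStream in, ClassLoader loader) throws IOException {
        super(in);
        this.loader = loader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        try {
            // Look the class up in the supplied loader first.
            return Class.forName(desc.getName(), false, loader);
        } catch (ClassNotFoundException e) {
            // Fall back to the default resolution (covers primitive types, for example).
            return super.resolveClass(desc);
        }
    }
}

class LoaderAwareDemo {
    // Serialize a value with plain Java serialization.
    static byte[] toBytes(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    // Deserialize, resolving classes through the given loader.
    static Object fromBytes(byte[] data, ClassLoader loader)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new LoaderAwareObjectInputStream(new ByteArrayInputStream(data), loader)) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> original = new ArrayList<>();
        original.add("checkpoint");
        // Assumption: in the thread's scenario this loader would come from kryo.getClassLoader().
        ClassLoader loader = LoaderAwareDemo.class.getClassLoader();
        Object restored = fromBytes(toBytes(original), loader);
        System.out.println(restored.equals(original));
    }
}
```

The fallback to `super.resolveClass` keeps the default behaviour for names the supplied loader cannot resolve, so the change only affects classes that the explicit loader actually knows about.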

Re: How the application recovery works when its started with -originalAppId

2017-08-10 Thread Pramod Immaneni
I believe those relate to different problems. This is a scenario where part of the deserialization is being outsourced to an external deserializer that is not using the correct class loader. The suggested fix to the behavior of the external serde seems to have worked, though I plan to follow up with

Re: How the application recovery works when its started with -originalAppId

2017-08-10 Thread Thomas Weise
There are a couple of bugs that were recently identified that look related to this: https://issues.apache.org/jira/browse/APEXMALHAR-2526 https://issues.apache.org/jira/browse/APEXCORE-767 Perhaps the fix for the first item is what you need? Thomas On Thu, Aug 10, 2017 at 8:41 PM, Pramod Immaneni

Re: How the application recovery works when its started with -originalAppId

2017-08-10 Thread Pramod Immaneni
I would dig deeper into why serde of the linked hashmap is failing. There is additional logging you can enable in kryo to get more insight. You can even try a standalone kryo test to see if it is a problem with the linkedhashmap itself or with some other object that was added to it. You

Re: How the application recovery works when its started with -originalAppId

2017-08-10 Thread Vivek Bhide
Thanks Pramod. This seems to have done the trick. I will check again when I have some data to process to see if that goes well with it. I am quite confident that it will. Just curious: is this the best way to handle this issue, or is there a more elegant way to address it? Regards Vivek

Re: How the application recovery works when its started with -originalAppId

2017-08-10 Thread Vivek Bhide
I have already pasted the stack trace in my original post. Also, can you please confirm which classpath kryo is using vs. the default Java serializer, and where exactly it is set for kryo? Regards Vivek

Re: How the application recovery works when its started with -originalAppId

2017-08-10 Thread Vivek Bhide
Hi Pramod, As I mentioned, we have an LRUCache (LinkedHashMap of ) which needs to be serialized, and it is initialized in the operator constructor. What we found is that when the operator is serialized for checkpointing, the content of this LRUCache is not getting serialized and instead it's just an
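
For readers unfamiliar with the pattern, the following is a rough sketch of a LinkedHashMap-based LRU cache together with a standalone round-trip check of the kind suggested elsewhere in this thread. It is not the poster's actual class: `LruCache`, `LruCacheRoundTrip`, the `String`/`Integer` type parameters, and the capacity of 2 are all assumptions for illustration. The round trip uses plain Java serialization as a stand-in; whether Kryo preserves the same state depends on the serializer it uses for the class, which this sketch does not cover.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache: an access-ordered LinkedHashMap with a size bound.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // true = iterate in access order, least recent first
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least recently used entry once the bound is exceeded.
        return size() > capacity;
    }
}

class LruCacheRoundTrip {
    // Serialize and deserialize a cache with plain Java serialization.
    @SuppressWarnings("unchecked")
    static <K, V> LruCache<K, V> roundTrip(LruCache<K, V> cache)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(cache);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (LruCache<K, V>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // touch "a" so that "b" becomes the eldest entry
        cache.put("c", 3); // exceeds the bound, so "b" is evicted
        LruCache<String, Integer> copy = roundTrip(cache);
        // If the restored copy were empty, the contents were lost in serialization.
        System.out.println(copy.equals(cache) + " " + copy.keySet());
    }
}
```

Comparing the restored copy against the original in isolation like this helps separate a problem with the map itself from a problem with some object stored in it.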

Re: How the application recovery works when its started with -originalAppId

2017-08-10 Thread Vivek Bhide
Hi Pramod, I get this error even when I try to resubmit the exact same apa. Is there any other angle of this problem that I should look at? Regards Vivek

Re: How the application recovery works when its started with -originalAppId

2017-08-10 Thread Pramod Immaneni
It will try to deserialize the old state (from the prior application's checkpoints) with your new jars from the apa that you are trying to launch. So if there are structural incompatibilities, the deser will fail. On Thu, Aug 10, 2017 at 11:41 AM, Vivek Bhide wrote: > Hi All, >

Re: data missing in AbstractFileOutPutOperator

2017-08-10 Thread Vivek Bhide
Hi Chiru, Have you tried waiting until your output file in HDFS rolls over (from .tmp to .0)? I have observed in our case that if you query the .tmp file, it may not show all the records written to it. The reason could be that the file is still eligible to be written to and output

Re: data missing in AbstractFileOutPutOperator

2017-08-10 Thread chiranjeevi vasupilli
I hope you got the issue. It's a random issue; the same record is not missing in all the runs. In the DT console we can see that the writer operator processed the required records, but the count does not match the data written to HDFS. As for the sequence of missing tuples, we are not sure because its