Hi Sanjay,

After working on this some more, I found a pattern in how and when the
code breaks, but it does not work in any of the configurations I tried.
Below are my observations so far:

1. Regarding setting a parameter: as you said, the app_name is optional,
and the reason is that you don't expect more than one streaming
application in your project. I think the app_name will matter if there is
more than one streaming application in your .apa with properties in the
same .xml file.
2. I tried setting maxWindowsWithNoData to a very high value, but the only
way I could set it was by using * in place of the operator name. The reason
is that HiveOutputModule doesn't accept it as a parameter; it is a
parameter of one of the operators inside HiveOutputModule, i.e.
AbstractFSRollingOutputOperator. At this point there is no provision for
setting a parameter of an operator embedded in a module, even with the
<modulename$operatorname> pattern. It is the module's responsibility to
accept it as a level 1 parameter from the properties file and set it on the
level 2 operator when it builds the DAG. I verified this with a quick test
case for another module that I have built in my project, and I can share
the code base for it.
3. File rollup depends on 2 params: maxWindowsWithNoData (from
AbstractFSRollingOutputOperator) and maxLength (from HiveModule).
        Case 1: maxWindowsWithNoData set to a high number and maxLength = 50MB
(default 128MB)
        Result: In this case the file rollup doesn't happen until the empty
window count reaches that number. I could see multiple 50 MB files created
under <hdfs_dir>/<yarn_app_id>/10/<partition_col>, but none of the files
rolled over from .tmp to the final file, even after running the app for
more than 10 hours.

        Case 2: maxWindowsWithNoData set to 480 (4 mins) and maxLength = 50MB
(default 128MB)
        Result: In this case, if the maxLength limit is reached first, I get
the exception below: a NullPointerException again, but with a different
stack trace. If maxWindowsWithNoData is reached first, I get the same
NullPointerException that I reported in the first place.

        2017-05-19 10:02:37,401 INFO  stram.StreamingContainerParent (StreamingContainerParent.java:log(170)) - child msg: [container_e3092_1491920474239_131026_01_000016] Entering heartbeat loop.. context: PTContainer[id=9(container_e3092_1491920474239_131026_01_000016),state=ALLOCATED,operators=[PTOperator[id=10,name=hiveOutput$fsRolling,state=PENDING_DEPLOY]]]
        2017-05-19 10:02:38,414 INFO  stram.StreamingContainerManager (StreamingContainerManager.java:processHeartbeat(1486)) - Container container_e3092_1491920474239_131026_01_000016 buffer server: d-d7zvfz1.target.com:45373
        2017-05-19 10:02:38,725 INFO  stram.StreamingContainerParent (StreamingContainerParent.java:log(170)) - child msg: Stopped running due to an exception. java.lang.NullPointerException
                at com.datatorrent.lib.io.fs.AbstractFileOutputOperator.requestFinalize(AbstractFileOutputOperator.java:742)
                at com.datatorrent.lib.io.fs.AbstractFileOutputOperator.rotate(AbstractFileOutputOperator.java:883)
                at com.datatorrent.contrib.hive.AbstractFSRollingOutputOperator.rotateCall(AbstractFSRollingOutputOperator.java:186)
                at com.datatorrent.contrib.hive.AbstractFSRollingOutputOperator.endWindow(AbstractFSRollingOutputOperator.java:227)
                at com.datatorrent.stram.engine.GenericNode.processEndWindow(GenericNode.java:153)
                at com.datatorrent.stram.engine.GenericNode.run(GenericNode.java:397)
                at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1428)
         context: PTContainer[id=9(container_e3092_1491920474239_131026_01_000016),state=ACTIVE,operators=[PTOperator[id=10,name=hiveOutput$fsRolling,state=PENDING_DEPLOY]]]
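
For reference, the two ways of addressing the property from point 2 would
look roughly like this in the properties file. This is only a sketch: it
assumes the usual dt.operator.<name>.prop.<property> key format, takes the
operator name hiveOutput$fsRolling from the log above and the value 480
from case 2, and the exact keys should be treated as illustrative.

```xml
<configuration>
  <!-- Took effect: wildcard in place of the operator name -->
  <property>
    <name>dt.operator.*.prop.maxWindowsWithNoData</name>
    <value>480</value>
  </property>
  <!-- Did not take effect in my tests: addressing the operator embedded
       in the module with the modulename$operatorname pattern -->
  <property>
    <name>dt.operator.hiveOutput$fsRolling.prop.maxWindowsWithNoData</name>
    <value>480</value>
  </property>
</configuration>
```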


In every case the code fails. I was really excited to have this
incorporated, but for now I have set it aside and am sticking to a simple
HDFS sink. I will work on it again and report more as time permits.
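
The forwarding described in point 2 can be sketched as follows. This is a
minimal, self-contained illustration and not the HiveOutputModule code:
Dag, RollingOperator and OutputModule are hypothetical stand-ins for the
Apex DAG, the embedded level 2 operator and the module, and the default of
100 is a placeholder value.

```java
import java.util.ArrayList;
import java.util.List;

public class ModulePropertyForwarding {

    // Hypothetical stand-in for the embedded (level 2) operator.
    public static class RollingOperator {
        private int maxWindowsWithNoData = 100; // placeholder default

        public void setMaxWindowsWithNoData(int value) {
            this.maxWindowsWithNoData = value;
        }

        public int getMaxWindowsWithNoData() {
            return maxWindowsWithNoData;
        }
    }

    // Hypothetical stand-in for the DAG that the module populates.
    public static class Dag {
        public final List<Object> operators = new ArrayList<>();

        public <T> T addOperator(T operator) {
            operators.add(operator);
            return operator;
        }
    }

    // Hypothetical module: it exposes the property at level 1 and copies
    // it onto the embedded operator while building the DAG.
    public static class OutputModule {
        private int maxWindowsWithNoData = 100; // placeholder default

        public void setMaxWindowsWithNoData(int value) {
            this.maxWindowsWithNoData = value;
        }

        public RollingOperator populateDag(Dag dag) {
            RollingOperator rolling = dag.addOperator(new RollingOperator());
            // Forward the level 1 value to the level 2 operator.
            rolling.setMaxWindowsWithNoData(maxWindowsWithNoData);
            return rolling;
        }
    }

    public static void main(String[] args) {
        OutputModule module = new OutputModule();
        module.setMaxWindowsWithNoData(480); // as if read from the properties file
        RollingOperator rolling = module.populateDag(new Dag());
        System.out.println(rolling.getMaxWindowsWithNoData());
    }
}
```

With this shape, the value only needs to be set on the module in the
properties file, and the embedded operator picks it up at DAG build time.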

Let me know your thoughts on this

Regards
Vivek



--
View this message in context: 
http://apache-apex-users-list.78494.x6.nabble.com/NullPointerException-at-AbstractFSRollingOutputOperator-while-using-HiveOutputModule-tp1625p1639.html
Sent from the Apache Apex Users list mailing list archive at Nabble.com.
