Hi, I am new to Gobblin and am trying to use Gobblin on YARN to import data from Postgres and store it in HDFS as daily-partitioned Avro files.
(Initially I hit an issue where Joda-Time does not handle Postgres timestamps with microsecond precision in JsonElementConversionFactory.DateConverter. I have not found a solution for that yet, so for now I am using the simple watermark type with user-defined partitions to test the rest of the configuration.)

Right now I am facing another issue: the job gets stuck after completing more than half of the work units, without any error logs. The Helix debug logs show that the tasks whose partition files are missing from the taskoutput dir changed from INIT to RUNNING in one task runner, say task runner 1, and later I can see the same transition to RUNNING in another task runner, say task runner 4. In between, there are log entries indicating that the task changed from RUNNING to COMPLETED, but the task output file is missing. The job never finishes; it just stays stuck. The Helix logs suggest that it does not see any pending tasks and that no task is assigned to any task runner.

The config works for a smaller volume of data (fewer partitions). I am not sure how to troubleshoot this one and would appreciate your suggestions.

Thanks & Regards,
Praveen
