Hi,

I am new to Gobblin and am trying to use Gobblin on YARN to import data from 
Postgres and store it in HDFS as Avro files (partitioned daily).

(Initially I ran into an issue with Joda-Time not handling Postgres 
timestamps with microsecond precision in 
JsonElementConversionFactory.DateConverter. I have not found a solution for 
that yet, so for now I am using the simple watermark type with a user-defined 
partition to test the rest of the configuration.)
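
To illustrate the timestamp issue: Postgres returns up to six fractional-second 
digits, while a millisecond-precision pattern expects at most three. The sketch 
below is plain Python, not Gobblin code, and the helper name trim_to_millis is 
made up for illustration. It only shows the kind of trimming that might work 
around the problem, assuming the value is available as a string before it 
reaches the date converter.

```python
import re

# Hypothetical helper, not part of Gobblin or Joda-Time.
# Matches a dot followed by three digits and at least one extra digit,
# i.e. a fractional-second part finer than milliseconds.
_FRACTION = re.compile(r"(\.\d{3})\d+")


def trim_to_millis(ts: str) -> str:
    """Drop fractional-second digits beyond millisecond precision."""
    return _FRACTION.sub(r"\1", ts, count=1)


if __name__ == "__main__":
    # Microsecond-precision value as Postgres might render it.
    print(trim_to_millis("2018-03-14 09:26:53.589793"))
```

Values that already have three or fewer fractional digits are returned 
unchanged, and anything after the fraction (such as a UTC offset) is kept.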

Right now I am facing another issue:

The job gets stuck after completing more than half of the workunits, without 
any error logs.
The Helix debug logs indicate that the tasks (for the partition files that are 
missing in the taskoutput dir) changed from INIT to RUNNING in one task runner, 
say task runner 1.
Later I can see the same transition to RUNNING in another task runner, say 
task runner 4.
In between there are logs indicating that the task changed from RUNNING to 
COMPLETED, but the task output file is missing.
The job never finishes; it just stays stuck. The Helix logs suggest that it 
doesn't see any pending tasks and that no task is assigned to any task runner.
The same config works for a smaller volume of data (fewer partitions).

I am not sure how to troubleshoot this. I would appreciate any suggestions.

Thanks & Regards,
Praveen


