Hi Folks,

I'd like to ask for suggestions on debugging SelectHiveQL processors. We've now twice seen a very odd failure mode in which a SelectHiveQL processor that had been running fine suddenly becomes "stuck". This is on 1.6.0, so a bit dated compared to 1.9.2, but I'm still very puzzled by the lack of error indications.

Symptom: the processor is running fine and continues to report 'running' on the canvas, but the input port begins to queue up and show backlogs. Stopping the processor on the canvas reports success and shows 'stopped', but trying to start it again gets the popup "No eligible components are selected. Please select the components to be stopped.". Making sure the processor is clearly selected gives the same error. The only way to get it unstuck is to restart the primary node; this appears to kill the affected threads and allows the processor to begin running again, and after that it is fine.

The issue appears to be directly related to the processor itself, as opposed to, say, the ConnectionPool. On that point, we tried restarting the ConnectionPool being used: the stop attempt hangs on the affected processor, to the point that the stop fails. Another oddity: we tried stopping upstream objects of the affected processor, and they report "cannot be disabled because it is referenced by 1 components that are currently running", even though the canvas clearly shows that processor as stopped.

What's really strange is the lack of error indications anywhere. We see nothing in the logs at all regarding the affected processor until the primary restart. Then we see the start event as the processor comes back online: "StandardProcessScheduler Starting SelectHiveQL id=".

I'd appreciate any suggestions on additional logging or other resources that would help debug this.

Thanks!
patw
