Thanks all for the quick comments!
If i understand the discussion correctly, I will just change all the added log 
printing statements to debug verbosity. I will upload a new patch for that 
shortly. 

Now a bit back story: the reason we are doing this is that we recently did an 
analysis on many bugs collected from JIRA to understand why today’s cloud 
system fails. And we found that almost all of the cluster-wide failures are 
caused by buggy exception handling, which would often turn a component failure 
into a cluster-wide one. One of the common bug pattern is ignoring some 
exceptions — allowing them to propagate and finally turn into disaster. 
Therefore we built a simple static checking tool just to check some simple 
rules for exception handling, such as if an exception is ignored. Admittedly, 
it would be much harder to reason about the potential consequences caused for 
ignoring a certain exception, that’s why without much more domain knowledge I 
can only recommend to 1) avoid over-catching an exception, especially when the 
handling logic will swallow it, and 2) log them, as what this patch does.

Nevertheless, it seems the four cases I mentioned in JIRA-6242 are particularly 
suspicious. It might be worthwhile to double check their correctness if you 
have time. I am reposting them below.

Thanks!
Ding

========================= 
Case 1: 
Line: 375, File: "com/cloud/network/ovs/OvsTunnelManagerImpl.java" 

326: try { 
327: String myIp = getGreEndpointIP(dest.getHost(), nw); 
  .. .. .. 
373: } catch (Exception e) { 
374: // I really thing we should do a better handling of these exceptions 
375: s_logger.warn("Ovs Tunnel network created tunnel failed", e); 
376: } 

Comment seems to suggest for a better handling. 
========================= 
========================= 
Case 2: 
Line: 2295, File: "com/cloud/resource/ResourceManagerImpl.java" 

2284: try { 
2285: /* 
2286: * FIXME: this is a buggy logic, check with alex. Shouldn't 
2287: * return if propagation return non null 
2288: */ 
2289: Boolean result = _clusterMgr.propagateResourceEvent(h.getId(), 
ResourceState.Event.UpdatePassword); 
2290: if (result != null) { 
2291: return result; 
2292: } 
2293: 
2294: doUpdateHostPassword(h.getId()); 
2295: } catch (AgentUnavailableException e) { 
2296: } 

Seem from the comment the logic should be fixed. 
A similar code snipeet is at: 
  Line: 2276, File: "com/cloud/resource/ResourceManagerImpl.java" 
========================= 

========================= 
Case 3: 

  Line: 184, File: 
"org/apache/cloudstack/api/command/user/autoscale/CreateAutoScaleVmGroupCmd.java"
 

174: try 
175: { 
176: success = _autoScaleService.configureAutoScaleVmGroup(this); 
177: if (success) { 
178: vmGroup = _entityMgr.findById(AutoScaleVmGroup.class, getEntityId()); 
179: AutoScaleVmGroupResponse responseObject = 
_responseGenerator.createAutoScaleVmGroupResponse(vmGroup); 
180: setResponseObject(responseObject); 
181: responseObject.setResponseName(getCommandName()); 
182: } 
183: } catch (Exception ex) { 
184: // TODO what will happen if Resource Layer fails in a step inbetween 
185: s_logger.warn("Failed to create autoscale vm group", ex); 
186: } 

The comment, "TODO", seems to suggest for better handling. 
========================= 
========================= 
Case 4: 

  Line: 222, File: "com/cloud/api/ApiDispatcher.java" 
This snippet is moved to ParamProcessWorker.java in the trunk. 

203: try { 
204: setFieldValue(field, cmd, paramObj, parameterAnnotation); 
     .. .. 
220: } catch (CloudRuntimeException cloudEx) { 
221: s_logger.error("CloudRuntimeException", cloudEx); 
222: // FIXME: Better error message? This only happens if the API command is 
not executable, which typically 
223: //means 
224: // there was 
225: // and IllegalAccessException setting one of the parameters. 
226: throw new ServerApiException(ApiErrorCode.INTERNAL_ERROR, "Internal error 
executing API command " + cmd.getCommandName().substring(0, 
cmd.getCommandName().length() - 8)); 
227: } 

The "FIXME" comment seems to suggest for getter error message. 
=========================
On Apr 2, 2014, at 4:37 PM, Daan Hoogland <daan.hoogl...@gmail.com> wrote:

> Ding Yuan,
> 
> Any objections to that?
> 
> On Wed, Apr 2, 2014 at 10:32 PM, Alena Prokharchyk
> <alena.prokharc...@citrix.com> wrote:
>> 
>> 
>> On 4/2/14, 1:27 PM, "Daan Hoogland" <daan.hoogl...@gmail.com> wrote:
>> 
>>> I think we agree indeed. Doesn't mean we should start this discuss
>>> thread or write a arch guideline on the wiki somewhere. Maybe Ding
>>> Yuan wants to do a preliminary version of it?
>> 
>> Wiki guide would be useful indeed.
>> 
>> 
>>> 
>>> In the meantime I don't think that it hurts for the present patch to
>>> do everything in debug and decide about higher levels needed later.
>> 
>> Agree.
>> 
>> 
>>> 
>>> regards,
>>> 
>>> On Wed, Apr 2, 2014 at 10:11 PM, Alena Prokharchyk
>>> <alena.prokharc...@citrix.com> wrote:
>>>> Daan,
>>>> 
>>>> Correct me if I¹m wrong, but all of the logging added by Ding, fall
>>>> under
>>>> "to go with it or to indicate passing a certain code path². I¹ve just
>>>> noticed that some of them were added with DEBUG, and some with WARN
>>>> level,
>>>> and wanted to correct that.
>>>> 
>>>> So we should:
>>>> 
>>>> 1) For sure: never print them out in WARN as there is nothing admin
>>>> should
>>>> do in this case, because the code just handles them by ignoring.
>>>> 2) Figure out what would be the correct level to log them with: INFO or
>>>> DEBUG
>>>> 
>>>> From ³Logging best practices² articles, I can see that people use INFO
>>>> as
>>>> a ³storyline² of normal application behavior, and DEBUG for sort of
>>>> information that helps you to track down the failure cases scenarios.
>>>> To
>>>> me, stuff added by Ding, falls under second category. But I might be
>>>> wrong
>>>> as I don¹t recall on the spot any discussions happening on the debug
>>>> topic, from the mailing list.
>>>> 
>>>> -Alena.
>>>> 
>>>> 
>>>> On 4/2/14, 12:57 PM, "Daan Hoogland" <daan.hoogl...@gmail.com> wrote:
>>>> 
>>>>> Alena,
>>>>> 
>>>>> What I read in your comment is a description of INFO vs WARN. Debug
>>>>> would be only for outputting stacktraces to go with it or to indicate
>>>>> passing a certain code path.
>>>>> 
>>>>> Agree?
>>>>> 
>>>>> On Wed, Apr 2, 2014 at 8:31 PM, Alena Prokharchyk
>>>>> <alena.prokharc...@citrix.com> wrote:
>>>>>> 
>>>>>> -----------------------------------------------------------
>>>>>> This is an automatically generated e-mail. To reply, visit:
>>>>>> https://reviews.apache.org/r/19917/#review39324
>>>>>> -----------------------------------------------------------
>>>>>> 
>>>>>> 
>>>>>> Is there a reason why logs for some exceptions are being logged in
>>>>>> DEBUG mode, and some in WARN? From my point of view, if the code only
>>>>>> catches it and doesn't error out, it should be logged in DEBUG. Lots of
>>>>>> Admins are seeking for WARN statements in the log, and they might be
>>>>>> confused seeing WARN w/o further failure or retry.
>>>>>> 
>>>>>> - Alena Prokharchyk
>>>>>> 
>>>>>> 
>>>>>> On April 2, 2014, 1:55 p.m., Ding Yuan wrote:
>>>>>>> 
>>>>>>> -----------------------------------------------------------
>>>>>>> This is an automatically generated e-mail. To reply, visit:
>>>>>>> https://reviews.apache.org/r/19917/
>>>>>>> -----------------------------------------------------------
>>>>>>> 
>>>>>>> (Updated April 2, 2014, 1:55 p.m.)
>>>>>>> 
>>>>>>> 
>>>>>>> Review request for cloudstack.
>>>>>>> 
>>>>>>> 
>>>>>>> Repository: cloudstack-git
>>>>>>> 
>>>>>>> 
>>>>>>> Description
>>>>>>> -------
>>>>>>> 
>>>>>>> This is the patch for JIRA-6242. See
>>>>>>> https://issues.apache.org/jira/browse/CLOUDSTACK-6242 for more
>>>>>>> details.
>>>>>>> Thanks!
>>>>>>> 
>>>>>>> 
>>>>>>> Diffs
>>>>>>> -----
>>>>>>> 
>>>>>>> 
>>>>>>> engine/orchestration/src/com/cloud/agent/manager/AgentManagerImpl.java
>>>>>>> 0d41bc1
>>>>>>> 
>>>>>>> engine/orchestration/src/com/cloud/agent/manager/ClusteredAgentManager
>>>>>>> Im
>>>>>>> pl.java 01508a4
>>>>>>> 
>>>>>>> engine/orchestration/src/com/cloud/vm/VirtualMachineManagerImpl.java
>>>>>>> 3e088db
>>>>>>> 
>>>>>>> engine/orchestration/src/org/apache/cloudstack/engine/datacenter/entit
>>>>>>> y/
>>>>>>> api/db/dao/EngineDataCenterDaoImpl.java 4b6818e
>>>>>>>  engine/schema/src/com/cloud/dc/dao/DataCenterDaoImpl.java ea5039f
>>>>>>>  engine/schema/src/com/cloud/host/dao/HostDaoImpl.java 426c90d
>>>>>>>  engine/schema/src/com/cloud/storage/dao/StoragePoolHostDaoImpl.java
>>>>>>> e42eaf4
>>>>>>>  engine/schema/src/com/cloud/storage/dao/VMTemplateDaoImpl.java
>>>>>>> 34fdca5
>>>>>>>  engine/schema/src/com/cloud/upgrade/dao/Upgrade2214to30.java
>>>>>>> 58dd916
>>>>>>>  engine/schema/src/com/cloud/vm/dao/ConsoleProxyDaoImpl.java 5e9c2f0
>>>>>>>  engine/schema/src/com/cloud/vm/dao/SecondaryStorageVmDaoImpl.java
>>>>>>> 1f382d6
>>>>>>> 
>>>>>>> engine/storage/src/org/apache/cloudstack/storage/datastore/DataObjectM
>>>>>>> an
>>>>>>> agerImpl.java 6ed1274
>>>>>>> 
>>>>>>> framework/ipc/src/org/apache/cloudstack/framework/serializer/OnwireCla
>>>>>>> ss
>>>>>>> Registry.java 83c8a42
>>>>>>> 
>>>>>>> plugins/hypervisors/xen/src/com/cloud/hypervisor/xen/discoverer/XcpSer
>>>>>>> ve
>>>>>>> rDiscoverer.java 0ad6dc4
>>>>>>> 
>>>>>>> plugins/hypervisors/xen/src/com/cloud/hypervisor/xen/resource/XenServe
>>>>>>> rC
>>>>>>> onnectionPool.java b779085
>>>>>>> 
>>>>>>> plugins/hypervisors/xen/src/com/cloud/hypervisor/xen/resource/XenServe
>>>>>>> rS
>>>>>>> torageProcessor.java e512046
>>>>>>> 
>>>>>>> plugins/storage/volume/solidfire/src/org/apache/cloudstack/storage/dat
>>>>>>> as
>>>>>>> tore/lifecycle/SolidFirePrimaryDataStoreLifeCycle.java af6a77a
>>>>>>>  server/src/com/cloud/resource/ResourceManagerImpl.java f9a59ba
>>>>>>>  server/src/com/cloud/server/ConfigurationServerImpl.java b8da4c8
>>>>>>> 
>>>>>>> services/console-proxy/server/src/com/cloud/consoleproxy/ConsoleProxyT
>>>>>>> hu
>>>>>>> mbnailHandler.java 06f21d3
>>>>>>>  utils/src/com/cloud/utils/net/NetUtils.java 6350986
>>>>>>> 
>>>>>>> Diff: https://reviews.apache.org/r/19917/diff/
>>>>>>> 
>>>>>>> 
>>>>>>> Testing
>>>>>>> -------
>>>>>>> 
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> 
>>>>>>> Ding Yuan
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Daan
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Daan
>> 
> 
> 
> 
> -- 
> Daan
> 

Reply via email to