Ah, understood.  I'll take a closer look at the logs and make sure that I 
didn't accidentally miss those lines when I pulled together the logs for this 
email chain.

Thanks,
David Mabry
On 1/30/18, 8:34 AM, "Wei ZHOU" <ustcweiz...@gmail.com> wrote:

    Hi David,
    
    I encountered the UnsupportedAnswer once before, when I made some changes in
    the kvm plugin.
    
    Normally there should be some network configuration entries in the
    agent.log, but I do not see them.
    
    -Wei
    
    
    2018-01-30 15:00 GMT+01:00 David Mabry <dma...@ena.com.invalid>:
    
    > Hi Wei,
    >
    > I detached the iso and received the same error.  Just out of curiosity,
    > what leads you to believe it is something in the vxlan code?  I guess at
    > this point, attaching a remote debugger to the agent in question might be
    > the best way to get to the bottom of what is going on.
    >
    > Thanks in advance for the help.  I really, really appreciate it.
    >
    > Thanks,
    > David Mabry
    >
    > On 1/30/18, 3:30 AM, "Wei ZHOU" <ustcweiz...@gmail.com> wrote:
    >
    >     The answer is probably caused by an exception in the cloudstack agent.
    >     I tried to migrate a vm in our testing env, and it works.
    >
    >     There are some differences between our env and yours:
    >     (1) vlan vs. vxlan
    >     (2) no ISO vs. attached ISO
    >     Both of us use ceph and centos7.
    >
    >     I suspect it is caused by the vxlan code.
    >     However, could you detach the ISO and try again?
    >
    >     -Wei
    >
    >
    >
    >     2018-01-29 19:48 GMT+01:00 David Mabry <dma...@ena.com.invalid>:
    >
    >     > Good day Cloudstack Devs,
    >     >
    >     > I've run across a real head scratcher.  I have two VMs (initially 3
    >     > VMs, but more on that later) on a single host that I cannot live
    >     > migrate to any other host in the same cluster.  We discovered this
    >     > after attempting to roll out patches going from CentOS 7.2 to CentOS
    >     > 7.4.  Initially, we thought it had something to do with the new
    >     > version of libvirtd or qemu-kvm on the other hosts in the cluster
    >     > preventing these VMs from migrating, but we are able to live migrate
    >     > other VMs to and from this host without issue.  We can even create
    >     > new VMs on this specific host and live migrate them after creation
    >     > with no issue.  We've put the migration source agent, migration
    >     > destination agent and the management server in debug and don't seem
    >     > to get anything useful other than "Unsupported command".  Luckily, we
    >     > did have one VM that was shut down and restarted; this is the 3rd VM
    >     > mentioned above.  Since that VM has been restarted, it has no issues
    >     > live migrating to any other host in the cluster.
    >     >
    >     > I'm at a loss as to what to try next and I'm hoping that someone out
    >     > there might have had a similar issue and could shed some light on
    >     > what to do.  Obviously, I can contact the customer and have them shut
    >     > down their VMs, but that will potentially just delay this problem to
    >     > be solved another day.  Even if shutting down the VMs is ultimately
    >     > the solution, I'd still like to understand what happened to cause
    >     > this issue in the first place, in the hope of preventing it in the
    >     > future.
    >     >
    >     > Here's some information about my setup:
    >     > Cloudstack 4.8 Advanced Networking
    >     > CentOS 7.2 and 7.4 Hosts
    >     > Ceph RBD Primary Storage
    >     > NFS Secondary Storage
    >     > Instance in Question for Debug: i-532-1392-NSVLTN
    >     >
    >     > I have attached relevant debug logs to this email if anyone wishes to
    >     > take a look.  I think the most interesting error message that I have
    >     > received is the following:
    >     >
    >     > 468390:2018-01-27 08:59:35,172 DEBUG [c.c.a.t.Request] (Work-Job-Executor-6:ctx-188ea30f job-181792/job-181802 ctx-8e7f45ad) (logid:f0888362) Seq 22-942378222027276319: Received:  { Ans: , MgmtId: 14038012703634, via: 22(csh02c01z01.nsvltn.ena.net), Ver: v1, Flags: 110, { UnsupportedAnswer } }
    >     > 468391:2018-01-27 08:59:35,172 WARN  [c.c.a.m.AgentManagerImpl] (Work-Job-Executor-6:ctx-188ea30f job-181792/job-181802 ctx-8e7f45ad) (logid:f0888362) Unsupported Command: Unsupported command issued: com.cloud.agent.api.PrepareForMigrationCommand.  Are you sure you got the right type of server?
    >     > 468392:2018-01-27 08:59:35,179 ERROR [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-6:ctx-188ea30f job-181792/job-181802 ctx-8e7f45ad) (logid:f0888362) Invocation exception, caused by: com.cloud.exception.AgentUnavailableException: Resource [Host:22] is unreachable: Host 22: Unable to prepare for migration due to Unsupported command issued: com.cloud.agent.api.PrepareForMigrationCommand.  Are you sure you got the right type of server?
    >     > 468393:2018-01-27 08:59:35,179 INFO  [c.c.v.VmWorkJobHandlerProxy] (Work-Job-Executor-6:ctx-188ea30f job-181792/job-181802 ctx-8e7f45ad) (logid:f0888362) Rethrow exception com.cloud.exception.AgentUnavailableException: Resource [Host:22] is unreachable: Host 22: Unable to prepare for migration due to Unsupported command issued: com.cloud.agent.api.PrepareForMigrationCommand.  Are you sure you got the right type of server?
    >     >
    >     > I've tracked this "Unsupported command" down in the CS 4.8 code to
    >     > cloudstack/api/src/com/cloud/agent/api/Answer.java which is the
    >     > generic answer class.  I believe where the error is really being
    >     > spawned from is
    >     > cloudstack/engine/orchestration/src/com/cloud/vm/VirtualMachineManagerImpl.java.
    >     > Specifically:
    >     >         Answer pfma = null;
    >     >         try {
    >     >             pfma = _agentMgr.send(dstHostId, pfmc);
    >     >             if (pfma == null || !pfma.getResult()) {
    >     >                 final String details = pfma != null ? pfma.getDetails() : "null answer returned";
    >     >                 final String msg = "Unable to prepare for migration due to " + details;
    >     >                 pfma = null;
    >     >                 throw new AgentUnavailableException(msg, dstHostId);
    >     >             }
    >     >
    >     > The pfma returned must be in error or is never returned and therefore
    >     > still null.  It appears that the answer should be coming from the
    >     > destination agent, but for the life of me I can't figure out what the
    >     > root cause of this error is beyond "Unsupported command issued".  What
    >     > command is unsupported?  My guess is that it could be something wrong
    >     > with the dxml that is generated and passed to the destination host,
    >     > but I have as yet been unable to catch that dxml in debug.
    >     >
    >     > Any help or guidance is greatly appreciated.
    >     >
    >     > Thanks,
    >     > David Mabry
    >     >
    >     >
    >
    >
    >
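
For anyone following along, the null/result check quoted from
VirtualMachineManagerImpl.java in the original message can be reproduced in
isolation.  This is only a rough sketch: the Answer and
AgentUnavailableException classes below are hypothetical stand-ins written for
this example, not the real CloudStack classes, and the exception message
format is an assumption modelled on the log lines quoted in the thread.  It
only shows how a failed or missing answer turns into the "Unable to prepare
for migration due to ..." text; it says nothing about why the agent returned
the failure.

```java
public class PrepareForMigrationCheck {

    /** Hypothetical stand-in for com.cloud.agent.api.Answer (not the real class). */
    static class Answer {
        private final boolean result;
        private final String details;

        Answer(boolean result, String details) {
            this.result = result;
            this.details = details;
        }

        boolean getResult() { return result; }
        String getDetails() { return details; }
    }

    /** Hypothetical stand-in; the message format is assumed from the quoted log. */
    static class AgentUnavailableException extends Exception {
        AgentUnavailableException(String msg, long hostId) {
            super("Resource [Host:" + hostId + "] is unreachable: Host " + hostId + ": " + msg);
        }
    }

    /** Same condition as the quoted snippet: a null or unsuccessful answer is fatal. */
    static void checkAnswer(Answer pfma, long dstHostId) throws AgentUnavailableException {
        if (pfma == null || !pfma.getResult()) {
            final String details = pfma != null ? pfma.getDetails() : "null answer returned";
            final String msg = "Unable to prepare for migration due to " + details;
            throw new AgentUnavailableException(msg, dstHostId);
        }
    }

    /** Returns "ok" on success, otherwise the exception message. */
    static String describe(Answer pfma, long dstHostId) {
        try {
            checkAnswer(pfma, dstHostId);
            return "ok";
        } catch (AgentUnavailableException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A failed answer carries the agent's details string through unchanged.
        System.out.println(describe(new Answer(false, "Unsupported command issued"), 22L));
        // A missing answer falls back to the fixed "null answer returned" text.
        System.out.println(describe(null, 22L));
        // A successful answer passes the check.
        System.out.println(describe(new Answer(true, null), 22L));
    }
}
```

In this sketch the two failure cases differ only in the details text, so the
"Unsupported command issued" wording in the logs points to an answer that was
returned with a failed result rather than to a null answer.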
    
