On Tue, Sep 26, 2017 at 1:58 PM, Alexander Wels <[email protected]> wrote:
> On Tuesday, September 26, 2017 3:26:44 AM EDT Tomas Jelinek wrote:
>> On Tue, Sep 26, 2017 at 9:17 AM, Miroslava Voglova <[email protected]>
>> wrote:
>>> From 4.0 the architecture family was renamed in script
>>> 04_00_0080_rename_architecture_family. So 'HotPlugCpuSupported',
>>> 'HotUnplugCpuSupported', 'HotPlugMemorySupported',
>>> 'HotUnplugMemorySupported', 'IsMigrationSupported',
>>> 'IsMemorySnapshotSupported' and 'IsSuspendSupported' are all in the
>>> db with x86, not x86_64. From my point of view there is nothing wrong
>>> with that particular line in [1].
>>>
>>> It could be that somewhere in the code the host architecture is used
>>> instead of the architecture family when asking for the value of these
>>> ConfigValues. But that would have thrown an exception even before my
>>> patch, because '{"x86:"true","ppc":"true"}' was the default value for
>>> HotPlugMemorySupported.
>>
>> I see a code path where the cluster arch can be set to x86_64 - it is
>> always executed for external VMs (imported from an external provider or
>> unmanaged). It does not happen all the time, it is only a fallback if
>> the arch type is not known/reported etc.
>>
>> @Alexander: by any chance, was this VM an unmanaged one? Or imported?
>> In the logs you should find something like:
>> "Illegal architecture type: {}, replacing with x86_64" or
>> "null architecture type, replacing with x86_64, {}".
>>
>> Also, if you create a new VM, can you start it?
>
> No, it's an old database though, from pre-4.0 times. These VMs have
> never been unmanaged or imported from external providers. I did not see
> that in the log; I had to manually step through the code to end up in
> the right place that causes the NPE. Like I said before, line 23 in
> FeatureSupported.java is the culprit IMO. It does:
>
> String value = archOptions.get(arch.name());
>
> arch is ArchitectureType, and arch.name returns x86_64, and if I
> understand right they should have done arch.getFamily().name(), which
> does happen 2 lines below it. Honestly I don't understand how any VMs
> are able to run with the code like that, since they all check to see if
> you can do memory hot plug before starting, and that check runs through
> this piece of code, which based on the contents of [1] should return an
> NPE since the database should not contain the x86_64 entries.

The reason it did not work is that there was a syntax error in the
vdc_options table, causing Config.<Map>getValue(feature,
version.getValue()) to return null. The VMs normally run because, if
there is no entry for x86_64, it then checks x86 two lines below.

>>> [1] https://gerrit.ovirt.org/#/c/81464/7/packaging/dbscripts/upgrade/pre_upgrade/0000_config.sql
>>>
>>> On Tue, Sep 26, 2017 at 9:00 AM, Tomas Jelinek <[email protected]>
>>> wrote:
>>>> On Mon, Sep 25, 2017 at 10:08 PM, Roy Golan <[email protected]> wrote:
>>>>> On Mon, 25 Sep 2017 at 22:52 Alexander Wels <[email protected]>
>>>>> wrote:
>>>>>> On Monday, September 25, 2017 3:50:56 PM EDT Roy Golan wrote:
>>>>>>> So somewhere in the code somebody used the Arch and not the
>>>>>>> family. See the enum getFamily() method.
>>>>>>
>>>>>> Yep, in particular line 23 of FeatureSupported.java.
>>>>>
>>>>> I meant the caller of the method on this line. Do you have it in the
>>>>> trace so we can see who passed x86_64 as arch?
>>>>>
>>>>>>> On Mon, 25 Sep 2017 at 22:31 Alexander Wels <[email protected]>
>>>>>>> wrote:
>>>>>>>> On Monday, September 25, 2017 3:24:14 PM EDT Roy Golan wrote:
>>>>>>>>> What JRE are you using? Any change with that?
>>>>>>>>
>>>>>>>> So I just figured out the problem, and it's really strange. It
>>>>>>>> has nothing to do with the SSL that the stack trace is
>>>>>>>> mentioning. I manually stepped through the code to see what was
>>>>>>>> going on, and it turns out it is failing in
>>>>>>>> FeatureSupported.java, in the supportedInConfig call from
>>>>>>>> hotPlugMemory.
>>>>>>>>
>>>>>>>> The Config.<Map>getValue(feature, version.getValue()) (version
>>>>>>>> is 4.2) is returning a map containing x86=true and ppc=true. But
>>>>>>>> then when it compares this to ArchitectureType.name() it returns
>>>>>>>> null, because .name() returns x86_64. So it appears that
>>>>>>>> sometime during the last few months we dropped the _64 in the
>>>>>>>> ArchitectureType, or at least in the database.
>>>>
>>>> It looks a lot like it was introduced here:
>>>> https://gerrit.ovirt.org/#/c/81464/
>>>>
>>>> @Mirka: what do you think?
>>>>
>>>>>>>> As soon as I added a vdc_options entry that contains an x86_64
>>>>>>>> value for that key it started working. Now I have checked with
>>>>>>>> Greg, who has a fresh database, that he can start VMs no
>>>>>>>> problem, and his database contains x86 instead of x86_64.
>>>>>>>>
>>>>>>>>> On Mon, 25 Sep 2017 at 21:12 Alexander Wels <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>> Hi guys,
>>>>>>>>>>
>>>>>>>>>> I seem to be having an issue starting VMs with the latest
>>>>>>>>>> master. Whenever I try to start a VM I get a null pointer
>>>>>>>>>> exception, and the VM doesn't start. I have debugged the
>>>>>>>>>> engine, and it appears that the null pointer happens after
>>>>>>>>>> the engine tries to connect to the host. In the stack trace I
>>>>>>>>>> see SSLPeerUnverifiedException, so it appears something went
>>>>>>>>>> wrong with a certificate somewhere.
>>>>>>>>>>
>>>>>>>>>> I have put my hosts in maintenance and re-enrolled the
>>>>>>>>>> certificate, but that doesn't appear to be helping at all.
>>>>>>>>>> Any other place I need to look at to make sure the engine can
>>>>>>>>>> talk to the hosts? This appears to have started after I
>>>>>>>>>> upgraded Wildfly to 11, so it is possible it has something to
>>>>>>>>>> do with that as well.
>>>>>>>>>>
>>>>>>>>>> Any help figuring this out would be appreciated.
>>>>>>>>>>
>>>>>>>>>> Alexander
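
For reference, the lookup discussed in this thread can be sketched roughly
as below. This is only an illustration under stated assumptions, not the
engine code: Arch and FeatureLookupSketch are hypothetical stand-ins for
ArchitectureType and FeatureSupported, the map is passed in directly
instead of being read through Config, and the null guard on the map is an
added assumption showing one way the NPE could be avoided when the
vdc_options value cannot be parsed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for ArchitectureType: only the
// name()/getFamily() relationship mentioned in the thread is modelled.
enum Arch {
    x86(null), ppc(null), x86_64(x86), ppc64(ppc);

    private final Arch family;

    Arch(Arch family) {
        this.family = family;
    }

    // A family architecture is its own family.
    Arch getFamily() {
        return family == null ? this : family;
    }
}

// Hypothetical stand-in for the supportedInConfig lookup.
final class FeatureLookupSketch {
    static boolean isSupported(Map<String, String> archOptions, Arch arch) {
        // Assumption: a malformed option value yields no map at all.
        // Without this guard the next line throws the NPE from the thread.
        if (archOptions == null) {
            return false;
        }
        // First try the exact architecture name, e.g. "x86_64" ...
        String value = archOptions.get(arch.name());
        if (value == null) {
            // ... then fall back to the family name, e.g. "x86".
            value = archOptions.get(arch.getFamily().name());
        }
        // parseBoolean(null) is false, so a missing entry means unsupported.
        return Boolean.parseBoolean(value);
    }
}
```

With a map holding only x86 and ppc entries, a lookup for x86_64 falls
through to the family entry, which matches the explanation above of why
VMs normally start; an explicit x86_64 entry, if present, takes
precedence over the family entry.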
_______________________________________________
Devel mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/devel
