Well, it depends on where you draw the boundaries of managix. What I did
was add an RMI object to the InstallerDriver that listens for state
changes from the cluster controller. This means some additional logic in
the CCApplicationEntryPoint: once the CC is ready, it contacts the
InstallerDriver over RMI, and only at that point can the InstallerDriver
return to managix and tell it that the startup is complete.
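
A minimal sketch of that push-style handshake, for illustration only. The interface name ClusterStateListener, the method clusterActive(), the registry binding name, and the port are all hypothetical and are not taken from the actual InstallerDriver or CCApplicationEntryPoint code. The installer side exports a listener and blocks on a latch; a thread standing in for the CC looks the listener up and calls it once it is "ready".

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class StartupNotificationSketch {

    /** Hypothetical remote interface; the CC calls it once it is ready. */
    public interface ClusterStateListener extends Remote {
        void clusterActive() throws RemoteException;
    }

    /** Installer-side listener: releases a latch when the CC reports in. */
    static final class LatchListener implements ClusterStateListener {
        private final CountDownLatch ready = new CountDownLatch(1);

        @Override
        public void clusterActive() {
            ready.countDown();
        }

        boolean awaitReady(long timeout, TimeUnit unit) throws InterruptedException {
            return ready.await(timeout, unit);
        }
    }

    /** Stand-in for the CC: looks up the listener and reports readiness. */
    static void notifyInstaller(int port) {
        try {
            Registry registry = LocateRegistry.getRegistry("127.0.0.1", port);
            ClusterStateListener listener =
                    (ClusterStateListener) registry.lookup("ClusterStateListener");
            listener.clusterActive();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /** Installer side: export the listener, then block until notified or timed out. */
    public static boolean runDemo(int port) {
        // Keep the exported stub on the loopback address for this local demo.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        LatchListener listener = new LatchListener();
        try {
            Registry registry = LocateRegistry.createRegistry(port);
            Remote stub = UnicastRemoteObject.exportObject(listener, 0);
            registry.rebind("ClusterStateListener", stub);
            try {
                // In a real deployment the CC is a separate process; a thread plays it here.
                new Thread(() -> notifyInstaller(port)).start();
                return listener.awaitReady(10, TimeUnit.SECONDS);
            } finally {
                // Unexport so no RMI threads keep the JVM alive afterwards.
                UnicastRemoteObject.unexportObject(listener, true);
                UnicastRemoteObject.unexportObject(registry, true);
            }
        } catch (RemoteException | InterruptedException e) {
            throw new IllegalStateException("startup handshake failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("startup complete: " + runDemo(41099));
    }
}
```

The timeout on the latch matters: if the CC never reports in, the installer gives up and can report a failed startup instead of hanging forever.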

Not sure if this is the right way to do it, but it is definitely better
than what we currently have.
Abdullah.

On Mon, Aug 24, 2015 at 10:00 PM, Chris Hillery <[email protected]>
wrote:

> Hopefully the solution won't involve additional important logic inside
> Managix itself?
>
> Ceej
> aka Chris Hillery
>
> On Mon, Aug 24, 2015 at 7:26 AM, abdullah alamoudi <[email protected]>
> wrote:
>
> > That works but it doesn't feel right doing it this way. I am going to fix
> > this one for good.
> >
> > Cheers,
> > Abdullah.
> >
> > On Mon, Aug 24, 2015 at 5:11 PM, Ian Maxon <[email protected]> wrote:
> >
> > > The way I assured liveness for the YARN installer was to try running
> > > "for $x in dataset Metadata.Dataset return $x" via the API. I just
> > > polled for a reasonable amount of time (though honestly, thinking
> > > about it now, the correct parameter to use for the polling interval
> > > is the startup wait time in the parameters file :) ). It's not
> > > perfect, but it gives fewer false positives than just checking ps for
> > > processes that look like CCs/NCs.
> > >
> > > - Ian.
> > >
> > > On Mon, Aug 24, 2015 at 5:03 AM, abdullah alamoudi <[email protected]>
> > > wrote:
> > >
> > > > Now that I think about it, maybe we should provide multiple ways to
> > > > do this: a polling mechanism that can be used at any time and a
> > > > pushing mechanism on startup.
> > > > I am going to start implementing this and will probably use RMI for
> > > > both directions (CC to InstallerDriver and InstallerDriver to CC).
> > > >
> > > > Cheers,
> > > > Abdullah.
> > > >
> > > > On Mon, Aug 24, 2015 at 2:19 PM, abdullah alamoudi <[email protected]>
> > > > wrote:
> > > >
> > > > > So after further investigation, it turned out that our startup
> > > > > process just starts the CC and NC processes and then checks that
> > > > > the processes are running. If they are, it reports the state of
> > > > > the cluster as active, and the subsequent test commands can start
> > > > > immediately.
> > > > >
> > > > > This means that the CC could have started but not yet be ready
> > > > > when we try to process the next command. To address this, we need
> > > > > a better way to tell when the startup procedure has completed. We
> > > > > can do this by pushing (the CC informs the installer driver when
> > > > > the startup is complete) or by polling (the installer driver
> > > > > queries the CC for the state of the cluster).
> > > > >
> > > > > I can do it either way, so let's vote. My vote goes to the pushing
> > > > > mechanism. Thoughts?
> > > > >
> > > > > On Mon, Aug 24, 2015 at 10:15 AM, abdullah alamoudi <[email protected]>
> > > > > wrote:
> > > > >
> > > > > >> This solution turned out to be incorrect. The test cases never
> > > > > >> fail when I build after switching to the join method, but
> > > > > >> running an actual asterix instance never succeeds, which is
> > > > > >> quite confusing.
> > > > > >>
> > > > > >> I also think that the startup script has a major bug where it
> > > > > >> might return before the startup is complete. More on this
> > > > > >> later...
> > > > >>
> > > > > >> On Mon, Aug 24, 2015 at 7:48 AM, abdullah alamoudi <[email protected]>
> > > > > >> wrote:
> > > > >>
> > > > >>> It is highly unlikely that it is related.
> > > > >>>
> > > > >>> Cheers,
> > > > >>> Abdullah.
> > > > >>>
> > > > > >>> On Mon, Aug 24, 2015 at 5:45 AM, Chen Li <[email protected]>
> > > > > >>> wrote:
> > > > >>>
> > > > > >>>> @Abdullah: Is this issue related to
> > > > > >>>> https://issues.apache.org/jira/browse/ASTERIXDB-1074? Ian and
> > > > > >>>> I plan to look into the details on Monday.
> > > > >>>>
> > > > > >>>> On Sun, Aug 23, 2015 at 10:08 AM, abdullah alamoudi <[email protected]>
> > > > > >>>> wrote:
> > > > >>>>
> > > > > >>>> > About 3-4 days ago, I was working on the addition of the
> > > > > >>>> > filesystem based feed adapter and it didn't take any time to
> > > > > >>>> > complete. However, when I wanted to build and make sure all
> > > > > >>>> > tests pass, I kept getting ConnectionRefused errors, which
> > > > > >>>> > caused the installer tests to fail every now and then.
> > > > > >>>> >
> > > > > >>>> > I knew the new change had nothing to do with this failure,
> > > > > >>>> > yet I couldn't direct my attention away from this bug (it
> > > > > >>>> > just bothered me so much and I knew it needed to be resolved
> > > > > >>>> > ASAP). After wasting countless hours, I was finally able to
> > > > > >>>> > figure out what was happening :-)
> > > > > >>>> >
> > > > > >>>> > In the startup routine, we start three Jetty web servers (Web
> > > > > >>>> > interface server, JSON API server, and Feed server). Some
> > > > > >>>> > time ago, we used to end the startup call before making sure
> > > > > >>>> > the server.isStarted() method returns true on all servers. At
> > > > > >>>> > that time, I introduced the waitUntilServerStarts method to
> > > > > >>>> > make sure we don't return before the servers are ready. It
> > > > > >>>> > turned out that this was an incorrect way to handle it (we
> > > > > >>>> > can blame stackoverflow for this one!) and it is not enough
> > > > > >>>> > that the server's isStarted() returns true. The correct way
> > > > > >>>> > to do this is to call the server.join() method after
> > > > > >>>> > server.start().
> > > > >>>> >
> > > > > >>>> > See:
> > > > > >>>> > http://stackoverflow.com/questions/15924874/embedded-jetty-why-to-use-join
> > > > >>>> >
> > > > > >>>> > This was as satisfying as it was frustrating, and you are
> > > > > >>>> > welcome for the future time I saved each of you :)
> > > > >>>> > --
> > > > >>>> > Amoudi, Abdullah.
> > > > >>>> >
> > > > >>>>
> > > > >>>
> > > > >>>
> > > > >>>
> > > > >>> --
> > > > >>> Amoudi, Abdullah.
> > > > >>>
> > > > >>
> > > > >>
> > > > >>
> > > > >> --
> > > > >> Amoudi, Abdullah.
> > > > >>
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Amoudi, Abdullah.
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Amoudi, Abdullah.
> > > >
> > >
> >
> >
> >
> > --
> > Amoudi, Abdullah.
> >
>



-- 
Amoudi, Abdullah.
