Yeah, so it seems our number one mistake is taking the Master down in response to having issues. I guess you get so comfortable bringing the cluster up and down when you are first starting out that it seems like a natural knee-jerk reaction. This most recent time there was something in yellow/red, but I don't recall what it said, and it didn't make sense to me. Since I was having problems with the web console and wasn't sure of the actual state of the Master, I just tried to stop it. When it pushed back on shutting down (running stop-all.sh) with something about access denied, I cancelled out of the shutdown script, so who knows where it ended up.
Could you explain a little more about the Master's monitoring console? It runs an embedded Jetty instance and renders data from JMX MBeans in the running Master, correct? I know there is an XML representation, and I thought I saw something about embedding it in a separate JMX console (or maybe I am blurring it with my reading on ZooKeeper and Hadoop), but is there a data store that holds that data? Is it accessible by some other means if the web console isn't responding?

On Sat, Jul 14, 2012 at 8:09 PM, Eric Newton <[email protected]> wrote:
> Is there anything red or yellow on the monitor pages?
>
> There's a layering to availability:
>
> Most of the monitoring is done via the master, so if it has recently
> restarted, you will see almost no useful information.
>
> The first tablet of the METADATA table needs to be assigned, recovered
> and functional. If you see only one tablet assigned... it needs to be
> healthy before anything else can happen.
>
> Next, the rest of the METADATA table needs to be assigned, recovered
> and functional.
>
> If you are seeing "-" then the METADATA table is not available for some
> reason.
>
> Ensure that hadoop & zookeeper are not using /tmp for storage.
>
> -Eric
>
> On Sat, Jul 14, 2012 at 7:18 PM, Roger Lloyd <[email protected]> wrote:
> > I was looking for some insights in regards to a couple of issues I have
> > seen, and the likely cause/solution.
> >
> > 1) Tables go blank
> >
> > So, everything was kicking along fine: I was loading data, and it worked
> > beautifully for days, even weeks, adding hundreds of millions of entries,
> > splitting tablets, etc. Suddenly, I ran into an issue where, in the web
> > console, all the tables just go to "-" for their values (except the
> > !METADATA table).
> >
> > What could/would cause this?
> >
> > What is the smart way to react? Our previous attempts have been 1) re-init
> > and reload through the Client API and 2) re-init and recover the tables
> > using the bulk loading scheme mentioned on this mailing list. I am not sure
> > we haven't taken more rash action than necessary, simply because we could
> > afford to reload, etc. When we increase our deployment, that will be less
> > of an option. I am also not sure whether we are doing something wrong overall.
> >
> > 2) Client connections to Zookeeper
> >
> > When I am writing a client in Eclipse, we seem to have this issue where it
> > cycles connections, creating and closing sessions (with no errors at all),
> > but if I suspend the thread in Eclipse and start it again, then the session
> > opens and stays open. I realize this is probably a Zookeeper problem, but
> > can someone give me a quick rundown of what is happening under the
> > hood, so I could try running some zkCli commands to simulate the issue?
> >
> > We are running version 1.4.0-incubating-SNAPSHOT and Zookeeper 3.4.3. If
> > we wanted to upgrade to 1.4.1, how involved would that be? Just replace the
> > jar files and the config files? Or would we need to migrate data?
> >
> > Thanks for your help.
> >
> > Roger
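On the question of reading the monitor's numbers when the web console is down: since the data ultimately comes from JMX MBeans inside the Master JVM, any generic JMX client (jconsole, or a few lines of Java) can read them directly, provided remote JMX is enabled on that JVM. Below is a minimal sketch; it queries the local platform MBean server so it runs anywhere, and the commented-out remote connection (the `master-host` name and port are placeholders, not values from this thread) shows how you'd point it at the Master instead. The exact MBean names the Master registers depend on the Accumulo version, so the sketch just lists whatever it finds and reads one standard attribute.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

public class JmxPeek {
    public static void main(String[] args) throws Exception {
        // For a remote Master JVM started with JMX remoting enabled
        // (e.g. -Dcom.sun.management.jmxremote.port=<port>), connect with:
        //   JMXServiceURL url = new JMXServiceURL(
        //       "service:jmx:rmi:///jndi/rmi://master-host:<port>/jmxrmi");
        //   MBeanServerConnection conn =
        //       JMXConnectorFactory.connect(url).getMBeanServerConnection();
        // Here we use the local platform server so the sketch is self-contained.
        MBeanServerConnection conn = ManagementFactory.getPlatformMBeanServer();

        // List every registered MBean; a running Master registers its own
        // beans alongside the standard java.lang ones.
        Set<ObjectName> names = conn.queryNames(null, null);
        for (ObjectName name : names) {
            System.out.println(name);
        }

        // Read one standard attribute to show how values are fetched.
        Object uptime = conn.getAttribute(
                new ObjectName("java.lang:type=Runtime"), "Uptime");
        System.out.println("Uptime ms: " + uptime);
    }
}
```

The same attribute reads work against the remote connection, so this gives you a path to the Master's stats that does not depend on its embedded Jetty console being responsive.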
