Hi Eric,
I believe I have most of steps 1-5 working. Data from "/usr/bin/df" is
being collected, parsed, stuck into HDFS, and then pulled out again and
placed into MySQL. However, HICC isn't showing me my data just yet...
The disk_2098_week table is filled out with several entries and looks
great. If I select my cluster from the "Cluster Selector" and "Last 12
Hours" from the "Time" widget, the "Disk Statistics" widget still says
"No Data available."
It appears to be because part of the SQL query includes the host name
which is coming across in the SQL parameters as "". However, since the
disk_2098_week table properly includes the host name, nothing is
returned by the query. Just for grins, I updated the table manually in
MySQL to blank out the host names and I get a super cool, pretty graph
(which looks great, BTW).
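A minimal sketch of why the empty host name matters, assuming a simplified table layout. The column names and the exact WHERE clause are hypothetical, not Chukwa's actual schema or HICC's actual query; only the table name comes from the description above.

```python
import sqlite3

# Hypothetical, simplified stand-in for disk_2098_week (the real table has more columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE disk_2098_week (timestamp TEXT, host TEXT, used_percent REAL)"
)
conn.executemany(
    "INSERT INTO disk_2098_week VALUES (?, ?, ?)",
    [
        ("2010-03-18 12:05:00", "myhost", 41.0),
        ("2010-03-18 12:10:00", "myhost", 41.5),
    ],
)


def rows_for_host(host):
    """Run a host-filtered query of the kind HICC appears to generate."""
    cur = conn.execute(
        "SELECT timestamp, used_percent FROM disk_2098_week "
        "WHERE host = ? ORDER BY timestamp",
        (host,),
    )
    return cur.fetchall()


empty_filter = rows_for_host("")        # host parameter arrives as ""
named_filter = rows_for_host("myhost")  # host parameter matches the stored rows
print(len(empty_filter), len(named_filter))
```

With the host parameter empty, the filter matches nothing even though the rows exist, which is consistent with the graph appearing once the host column is blanked out.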
Additionally, if I select other time periods such as "Last 1 Hour", I
see the query is using UTC or something (at 1:00 PM PDT, I see the query is
using a range of 19:00-20:00). However, the data in MySQL is based on
PDT, so no matches are found. It appears that the "time_zone" session
attribute contains the value "UTC". Where is this coming from and how
can I change it?
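The mismatch can be reproduced with a small sketch. It assumes the "Last 1 Hour" window is computed in UTC while the stored timestamps are local PDT wall-clock times; the date and the sample row are illustrative, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-7 offset; enough for this illustration, no time zone database needed.
PDT = timezone(timedelta(hours=-7), "PDT")


def last_hour_window_utc(now_local):
    """Return the (start, end) of the trailing one-hour window, expressed in UTC."""
    end = now_local.astimezone(timezone.utc)
    start = end - timedelta(hours=1)
    return start, end


now = datetime(2010, 3, 18, 13, 0, tzinfo=PDT)  # 1:00 PM PDT, illustrative date
start, end = last_hour_window_utc(now)
window = (start.strftime("%H:%M"), end.strftime("%H:%M"))

# A row stored with a local (PDT) wall-clock timestamp and no zone information.
stored = datetime(2010, 3, 18, 12, 30)
matches = start.replace(tzinfo=None) <= stored <= end.replace(tzinfo=None)
print(window, matches)
```

The window comes out as 19:00 to 20:00, and a row stamped 12:30 local time falls outside it, so the query returns nothing unless both sides use the same zone.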
Problems:
1. How do I get the "Hosts Selector" in HICC to include my host name
so that the generated SQL queries are correct?
2. How do I make the "time_zone" session parameter use PDT vs. UTC?
3. How do I populate the other tables, such as "disk_489_month"?
Thanks,
Kirk
Eric Yang wrote:
Output from the df command is loaded into the disk_xxxx_week table in MySQL,
if I remember correctly. Are the database tables getting created in MySQL?
Make sure that you have:
<property>
<name>chukwa.post.demux.data.loader</name>
<value>org.apache.hadoop.chukwa.dataloader.MetricDataLoaderPool,org.apache.hadoop.chukwa.dataloader.FSMDataLoader</value>
</property>
in chukwa-demux-conf.xml.
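A quick way to sanity-check that property is to parse it and list the loader classes. This is only a sketch: the helper names are hypothetical, and how Chukwa itself tokenizes the value is an assumption here. The whitespace check is included because a line wrap inside the value would leave a break in the middle of a class name.

```python
import xml.etree.ElementTree as ET

PROPERTY_XML = """
<property>
  <name>chukwa.post.demux.data.loader</name>
  <value>org.apache.hadoop.chukwa.dataloader.MetricDataLoaderPool,org.apache.hadoop.chukwa.dataloader.FSMDataLoader</value>
</property>
"""


def configured_loaders(xml_text):
    """Return the comma-separated loader class names from a single <property>."""
    prop = ET.fromstring(xml_text)
    if prop.findtext("name") != "chukwa.post.demux.data.loader":
        return []
    value = prop.findtext("value") or ""
    return [name.strip() for name in value.split(",") if name.strip()]


def has_embedded_whitespace(class_name):
    """True if a class name contains a space, tab, or line break."""
    return any(ch.isspace() for ch in class_name)


loaders = configured_loaders(PROPERTY_XML)
broken = [name for name in loaders if has_embedded_whitespace(name)]
print(loaders, broken)
```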
The rough picture of the data flow looks like this:
1. demux -> Generates Chukwa record outputs.
2. archive -> Generates bigger files by compacting data sink files.
(Concurrent with step 1.)
3. postProcess -> Looks up which files were generated by the demux process and
dispatches them to the different data loaders.
4. MetricDataLoaderPool -> Dispatches multiple threads to load Chukwa
record files into different MDLs.
5. MetricDataLoader -> Loads sequence files into the database by record type,
as defined in mdl.xml.
6. HICC widgets have a descriptor language in JSON. You can find the widget
descriptor files in hdfs://namenode:port/chukwa/hicc/widgets, which
embed the full SQL template, like:
Query="select cpu_user_pcnt from [system_metrics] where timestamp between
[start] and [end]"
This outputs the metrics in JSON format, and the HICC graphing widget
renders the graph.
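To make the template idea concrete, here is a rough sketch of how the bracketed placeholders could be expanded before the query runs. This is not HICC's actual code: the function name, the resolved table name system_metrics_2098_week, and the timestamp values are all hypothetical, and plain string substitution is used only for illustration (real code should bind values as query parameters).

```python
import re


def expand_query(template, values):
    """Replace each [name] placeholder with its value; fail on unknown names."""

    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError("no value for placeholder [%s]" % key)
        return values[key]

    return re.sub(r"\[(\w+)\]", substitute, template)


template = (
    "select cpu_user_pcnt from [system_metrics] "
    "where timestamp between [start] and [end]"
)
sql = expand_query(
    template,
    {
        "system_metrics": "system_metrics_2098_week",  # hypothetical table name
        "start": "'2010-03-18 19:00:00'",              # illustrative window
        "end": "'2010-03-18 20:00:00'",
    },
)
print(sql)
```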
If there is no data, look at postProcess.log and make sure the data loading
is not throwing exceptions. Steps 3 to 6 are deprecated and will be
replaced with something else. Hope this helps.
Regards,
Eric
On 3/17/10 4:16 PM, "Kirk True" <k...@mustardgrain.com> wrote:
Hi Eric,
Eric Yang wrote:
Hi Kirk,
I am working on a design which removes MySQL from Chukwa. I am making this
departure from MySQL because the MDL framework was built for prototyping. It
will not scale in a production system where Chukwa could be hosted on a large
Hadoop cluster. HICC will serve data directly from HDFS in the future.
Meanwhile, the dbAdmin.sh from Chukwa 0.3 is still compatible with the trunk
version of Chukwa. You can load ChukwaRecords using the
org.apache.hadoop.chukwa.dataloader.MetricDataLoader class or mdl.sh from
Chukwa 0.3.
I'm to the point where the "df" example is working and demux is storing
ChukwaRecord data in HDFS. When I run dbAdmin.sh from 0.3.0, no data is
getting updated in the database.
My question is: what's the process to get a custom Demux implementation to be
viewable in HICC? Are the database tables magically created and populated for
me? Does HICC generate a widget for me?
HICC looks very nice, but when I try to add a widget to my dashboard, the
preview always reads, "No Data Available." I'm running
$CHUKWA_HOME/bin/start-all.sh followed by $CHUKWA_HOME/bin/dbAdmin.sh (which
I've manually copied to the bin directory).
What am I missing?
Thanks,
Kirk
The MetricDataLoader class will be marked as deprecated, and it will not be
supported once we make the transition to Avro + TFile.
Regards,
Eric
On 3/15/10 11:56 AM, "Kirk True" <k...@mustardgrain.com> wrote:
Hi all,
I recently switched to trunk as I was experiencing a lot of issues with
0.3.0. In 0.3.0, there was a dbAdmin.sh script that would run and try to
stick data in MySQL from HDFS. However, that script is gone and when I
run the system as built from trunk, nothing is ever populated in the
database. Where are the instructions for setting up the HDFS -> MySQL
data migration for HICC?
Thanks,
Kirk